Description

[0001] This invention relates to the field of receptors belonging to the superfamily of nuclear hormone receptors, in
particular to steroid receptors. The invention relates to DNA encoding a novel steroid receptor, the preparation of said
receptor, the receptor protein, and the uses thereof.

[0002] Steroid hormone receptors belong to a superfamily of nuclear hormone receptors involved in ligand-dependent
transcriptional control of gene expression. In addition, this superfamily includes receptors for non-steroid hormones
such as vitamin D, thyroid hormones and retinoids (Giguère et al, Nature 330, 624-629, 1987; Evans, R.M., Science
240, 889-895, 1988). Moreover, a range of nuclear receptor-like sequences have been identified which encode so-called
'orphan' receptors: these receptors are structurally related to and therefore classified as nuclear receptors, although
no putative ligands have been identified yet (B.W. O'Malley, Endocrinology 125, 1119-1170, 1989; D.J. Mangelsdorf
and R.M. Evans, Cell, 83, 841-850, 1995).

[0003] The members of the superfamily of nuclear hormone receptors share a modular structure in which six distinct
structural and functional domains, A to F, are displayed (Evans, Science 240, 889-895, 1988). A nuclear hormone receptor
is characterized by a variable N-terminal region (domain A/B), followed by a centrally located, highly conserved DNA-binding
domain (hereinafter referred to as DBD; domain C), a variable hinge region (domain D), a conserved ligand-binding
domain (hereinafter referred to as LBD; domain E) and a variable C-terminal region (domain F).

[0004] The N-terminal region, which is highly variable in size and sequence, is poorly conserved among the different
members of the superfamily. This part of the receptor is involved in the modulation of transcription activation (Bocquel
et al, Nucl. Acids Res., 17, 2581-2595, 1989; Tora et al, Cell 59, 477-487, 1989).

[0005] The DBD consists of approximately 66 to 70 amino acids and is responsible for DNA-binding activity: it targets
the receptor to specific DNA sequences called hormone responsive elements (hereinafter referred to as HRE) within
the transcription control unit of specific target genes on the chromatin (Martinez and Wahli, In 'Nuclear Hormone
Receptors', Acad. Press, 125-153, 1991).

[0006] The LBD is located in the C-terminal part of the receptor and is primarily responsible for ligand binding activity.
In this way, the LBD is essential for recognition and binding of the hormone ligand and, in addition, possesses a
transcription activation function, thereby determining the specificity and selectivity of the hormone response of the receptor.
Although moderately conserved in structure, the LBD's are known to vary considerably in homology between the
individual members of the nuclear hormone receptor superfamily (Evans, Science 240, 889-895, 1988; P.J. Fuller, FASEB
J., 5, 3092-3099, 1991; Mangelsdorf et al, Cell, Vol. 83, 835-839, 1995).

[0007] Functions present in the N-terminal region, LBD and DBD operate independently from each other and it has 
been shown that these domains can be exchanged between nuclear receptors (Green et al, Nature, Vol. 325, 75-78, 
1987). This results in chimeric nuclear receptors, such as described for instance in WO-A-8905355. 

[0008] When a hormone ligand for a nuclear receptor enters the cell by diffusion and is recognized by the LBD, it
will bind to the specific receptor protein, thereby initiating an allosteric alteration of the receptor protein. As a result of
this alteration the ligand/receptor complex switches to a transcriptionally active state and as such is able to bind through
the presence of the DBD with high affinity to the corresponding HRE on the chromatin DNA (Martinez and Wahli,
'Nuclear Hormone Receptors', 125-153, Acad. Press, 1991). In this way the ligand/receptor complex modulates
expression of the specific target genes. The diversity achieved by this family of receptors results from their ability to
respond to different ligands.

[0009] The steroid hormone receptors are a distinct class of the nuclear receptor superfamily, characterized in that
the ligands are steroid hormones. The receptors for glucocorticoids (GR), mineralocorticoids (MR), progestins (PR),
androgens (AR) and estrogens (ER) are classical steroid receptors. Furthermore, the steroid receptors have the unique
ability upon activation to bind to palindromic DNA sequences, the so-called HRE's, as homodimers. The GR, MR, PR
and AR recognize the same DNA sequence, while the ER recognizes a different DNA sequence (Beato et al, Cell,
Vol. 83, 851-857, 1995). After binding to DNA, the steroid receptor is thought to interact with components of the basal
transcriptional machinery and with sequence-specific transcription factors, thus modulating the expression of specific
target genes.

[0010] Several HRE's have been identified, which are responsive to the hormone/receptor complex. These HRE's
are situated in the transcriptional control units of the various target genes such as mammalian growth hormone genes
(responsive to glucocorticoid, estrogen, testosterone), mammalian prolactin genes and progesterone receptor genes
(responsive to estrogen), avian ovalbumin genes (responsive to progesterone), mammalian metallothionein gene
(responsive to glucocorticoid) and mammalian hepatic α2u-globulin gene (responsive to estrogen, testosterone,
glucocorticoid).

[0011] The steroid hormone receptors have been known to be involved in embryonic development, adult homeostasis
as well as organ physiology. Various diseases and abnormalities have been ascribed to a disturbance in the steroid
hormone pathway. Since the steroid receptors exercise their influence as hormone-activated transcriptional modulators,
it can be anticipated that mutations and defects in these receptors, as well as overstimulation or blocking of these
receptors might be the underlying reason for the altered pattern. A better knowledge of these receptors, their
mechanism of action and of the ligands which bind to said receptors might help to create a better insight into the underlying
mechanism of the hormone signal transduction pathway, which eventually will lead to better treatment of the diseases
and abnormalities linked to altered hormone/receptor functioning.

[0012] For this reason cDNA's of the steroid and several other nuclear receptors of several mammals, including
humans, have been isolated and the corresponding amino acid sequences have been deduced, such as for example
the human steroid receptors PR, ER, GR, MR, and AR, the human non-steroid receptors for vitamin D, thyroid
hormones, and retinoids such as retinol A and retinoic acid. In addition, cDNA's encoding well over 100 mammalian orphan
receptors have been isolated, for which no putative ligands are known yet (Mangelsdorf et al, Cell, Vol. 83, 835-839,
1995). However, there is still a great need for the elucidation of other nuclear receptors in order to unravel the various
roles these receptors play in normal physiology and pathology.

[0013] The present invention provides for such a novel nuclear receptor. More specifically, the invention provides for
an isolated estrogen receptor having an N-terminal domain, a DNA-binding domain, and a ligand-binding domain,
wherein the amino acid sequence of said DNA-binding domain exhibits at least 80% homology with the amino acid
sequence shown in SEQ ID NO:3 and the amino acid sequence of said ligand-binding domain of said estrogen receptor
exhibits at least 70% homology with the amino acid sequence shown in SEQ ID NO:4,
provided that the estrogen receptor does not have the amino acid sequence:

MTFYS PAVMN YSVPG STSNL DGGPV RLSTS PNVLW PTSGH LSPLA 

THCQS SLLYA EPQKS PWCEA RSLEH TLPVN RETLK RKLSG SSCAS 

PVTSP NAKRD AHFCP VCSDY ASGYH YGVWS CEGCK AFFKR SIQGH 
NDYIC PATNQ CTIDK NRRKS CQACR LRKCY EVGMV KCGSR RERCG 

YRIVR RQRSS SEQVH CLSKA KRNGG HAPRV KELLL STLSP EQLVL 

TLLEA EPPNV LVSRP SMPFT EASMM MSLTK LADKE LVHMI GWAKK 

IPGFV ELSLL DQVRL LESCW MEVLM VGLMW RSIDH PGKLI FAPDL 

VLDRD EGKCV EGXLE IFDML LATTS RFREL KLQHK EYLCV KAMIL 

LNSSM YPLAS ANQEA ESSRK LTHLL NAVTD ALVWV IAKSG ISSQQ 

QSVRL ANLLM LLSHV RHISN KGMEH LLSMK CKNW PVYDL LLEML 

NAHTL RGYKS SISGS ECSST EDSKN KESSQ NLQSQ . 


The disclaimer relates to the non-prepublished patent application WO 97/09348. 

[0014] Hereby, the present invention provides for a novel steroid receptor, having estrogen mediated activity. Said
novel steroid receptors are novel estrogen receptors, which are able to bind and be activated by, for example, estradiol,
estrone and estriol. According to the present invention it has been found that a novel estrogen receptor is expressed
as an 8 kb transcript in human thymus, spleen, peripheral blood lymphocytes (PBLs), ovary and testis. Furthermore,
additional transcripts have been identified. Another transcript of approximately 10 kb was identified in ovary, thymus
and spleen. In testis, an additional transcript of 1.3 kb was detected. These transcripts are probably generated by
alternative splicing of the gene encoding the novel estrogen receptor according to the invention.

[0015] Cloning of the cDNA's encoding the novel estrogen receptors according to the invention revealed that several
splicing variants of said receptor can be distinguished. At the protein level, these variants differ only at the C-terminal
part.

[0016] cDNA encoding an ER has been isolated (Green et al, Nature 320, 134-139, 1986; Greene et al, Science
231, 1150-1154, 1986), and the corresponding amino acid sequence has been deduced. This receptor and the receptor
according to the present invention, however, are distinct, and encoded by different genes with different nucleic acid
sequences. Not only do the ER of the prior art (hereinafter referred to as classical ER) and the ER according to the
present invention differ in amino acid sequence, they are also located on different chromosomes. The gene encoding
the classical ER is located on chromosome 6, whereas the gene encoding the ER according to the invention was found
to be located on chromosome 14. The ER according to the invention furthermore distinguishes itself from the classical
receptor by differences in tissue distribution, indicating that there may be important differences between these receptors
at the level of estrogenic signalling.

[0017] In addition, two orphan receptors, ERRα and ERRβ, having an estrogen receptor related structure have been
described (Giguère et al, Nature 331, 91-94, 1988). These orphan receptors, however, have not been reported to be
able to bind estradiol or any other hormone that binds to the classical ER, and other ligands which bind to these
receptors have not been found yet. The novel estrogen receptor according to the invention distinguishes itself clearly
from these receptors since it was found to bind estrogens.

[0018] The fact that a novel ER according to the invention has been found is all the more surprising, since any
suggestion towards the existence of additional estrogen receptors was absent in the scientific literature: neither the
isolation of the classical ER nor the orphan receptors ERRα and ERRβ suggested or hinted towards the presence of
additional estrogen receptors such as the receptors according to the invention. The identification of additional ER's
could be a major step forward for the existing clinical therapies, which are based on the existence of one ER and as
such ascribe all estrogen mediated abnormalities and/or diseases to this one receptor. The receptors according to the
invention will be useful in the development of hormone analogs that selectively activate either the classical ER or the
novel estrogen receptor according to the invention. This should be considered as one of the major advantages of the
present invention.

[0019] Thus, in one aspect, the present invention provides for isolated cDNA encoding a novel steroid receptor. In
particular, the present invention provides for isolated cDNA encoding the novel estrogen receptor as defined above.

[0020] According to this aspect of the present invention, there is provided an isolated DNA encoding a steroid receptor
protein having an N-terminal domain, a DNA-binding domain and a ligand-binding domain, wherein the amino acid
sequence of said DNA-binding domain of said receptor protein exhibits at least 80% homology with the amino acid
sequence shown in SEQ ID NO:3, and the amino acid sequence of said ligand-binding domain of said receptor protein
exhibits at least 70% homology with the amino acid sequence shown in SEQ ID NO:4.

[0021] In particular, the isolated DNA encodes a steroid receptor protein having an N-terminal domain, a DNA-binding
domain and a ligand-binding domain, wherein the amino acid sequence of said DNA-binding domain of said receptor
protein exhibits at least 90%, preferably 95%, more preferably 98%, most preferably 100% homology with the amino
acid sequence shown in SEQ ID NO:3.

[0022] More particularly, the isolated DNA encodes a steroid receptor protein having an N-terminal domain, a DNA-binding
domain and a ligand-binding domain, wherein the amino acid sequence of said ligand-binding domain of said
receptor protein exhibits at least 75%, preferably 80%, more preferably 90%, most preferably 100% homology with the
amino acid sequence shown in SEQ ID NO:4.

[0023] A preferred isolated DNA according to the invention encodes a steroid receptor protein having the amino acid 
sequence shown in SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:21 or SEQ ID NO:25. 

[0024] A more preferred isolated DNA according to the invention is an isolated DNA comprising a nucleotide
sequence shown in SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:20 or SEQ ID NO:24.

[0025] The DNA according to the invention may be obtained from cDNA. Alternatively, the coding sequence might 
be genomic DNA, or prepared using DNA synthesis techniques. 

[0026] The DNA according to the invention will be very useful for in vivo expression of the novel receptor proteins 
according to the invention in sufficient quantities and in substantially pure form. 

[0027] In another aspect of the invention, there is provided a steroid receptor comprising the amino acid sequence
encoded by the above described DNA molecules.

[0028] The steroid receptor according to the invention has an N-terminal domain, a DNA-binding domain and a ligand-binding
domain, wherein the amino acid sequence of said DNA-binding domain of said receptor exhibits at least 80%
homology with the amino acid sequence shown in SEQ ID NO:3, and the amino acid sequence of said ligand-binding
domain of said receptor exhibits at least 70% homology with the amino acid sequence shown in SEQ ID NO:4.

[0029] In particular, the steroid receptor according to the invention has an N-terminal domain, a DNA-binding domain
and a ligand-binding domain, wherein the amino acid sequence of said DNA-binding domain of said receptor exhibits 
at least 90%, preferably 95%, more preferably 98%, most preferably 100% homology with the amino acid sequence 
shown in SEQ ID NO:3. 

[0030] More particularly, the steroid receptor according to the invention has an N-terminal domain, a DNA-binding
domain and a ligand-binding domain, wherein the amino acid sequence of said ligand-binding domain of said receptor
exhibits at least 75%, preferably 80%, more preferably 90%, most preferably 100% homology with the amino acid
sequence shown in SEQ ID NO:4.

[0031] It will be clear to those skilled in the art that steroid receptor proteins comprising combined DBD and
LBD preferences, and DNA encoding such receptors, are also subject of the invention.

[0032] Preferably, the steroid receptor according to the invention comprises an amino acid sequence shown in SEQ 
ID NO:5, SEQ ID NO:6, SEQ ID NO:21 or SEQ ID NO:25. 

[0033] Also within the scope of the present invention are steroid receptor proteins which comprise variations in the
amino acid sequence of the DBD and LBD without losing their respective DNA-binding or ligand-binding activities.
The variations that can occur in those amino acid sequences comprise deletions, substitutions, insertions, inversions
or additions of (an) amino acid(s) in said sequence, said variations resulting in amino acid difference(s) in the overall
sequence. It is well known in the art of proteins and peptides that these amino acid differences lead to amino acid
sequences that are different from, but still homologous with, the native amino acid sequence they have been derived
from.

[0034] Amino acid substitutions that are expected not to essentially alter biological and immunological activities
have been described in, for example, Dayhoff, M.O., Atlas of protein sequence and structure, Nat. Biomed. Res. Found.,
Washington D.C., 1978, vol. 5, suppl. 3. Amino acid replacements between related amino acids or replacements which
have occurred frequently in evolution are, inter alia, Ser/Ala, Ser/Gly, Asp/Gly, Arg/Lys, Asp/Asn, Ile/Val. Based on this
information Lipman and Pearson developed a method for rapid and sensitive protein comparison (Science 227,
1435-1441, 1985) and for determining the functional similarity between homologous polypeptides.

[0035] Variations in amino acid sequence of the DBD according to the invention resulting in an amino acid sequence
that has at least 80% homology with the sequence of SEQ ID NO:3 will lead to receptors still having sufficient DNA
binding activity. Variations in amino acid sequence of the LBD according to the invention resulting in an amino acid
sequence that has at least 70% homology with the sequence of SEQ ID NO:4 will lead to receptors still having sufficient
ligand binding activity.

[0036] Homology as defined herein is expressed in percentages, determined via PCGENE. Homology is calculated
as the percentage of identical residues in an alignment with the sequence according to the invention. Gaps are allowed
to obtain maximum alignment.
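The homology measure defined above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned (gaps written as "-") and that the percentage is taken over all aligned columns; the text does not state which denominator PCGENE uses, so that choice, like the function name `percent_identity`, is an assumption made for illustration only.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical residues between two pre-aligned sequences.

    Assumption: the denominator is the number of alignment columns in which
    at least one sequence has a residue. PCGENE may count differently.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    identical = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # a column of two gaps carries no information
        compared += 1
        if a == b:
            identical += 1  # here neither symbol can be a gap
    if compared == 0:
        raise ValueError("no residues to compare")
    return 100.0 * identical / compared
```

With this convention a DBD variant would meet the 80% criterion of paragraph [0035] when `percent_identity(variant, reference) >= 80.0`.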

[0037] Comparing the amino acid sequences of the classical ER and the ER's according to the invention revealed
a high degree of similarity within their respective DBD's. The conservation of the P-box (amino acids E-G-X-X-A) which
is responsible for the actual interactions of the classical ER with the target DNA element (Zilliacus et al., Mol. Endo. 9,
389, 1995; Glass, End. Rev. 15, 391, 1994), is indicative for a recognition of estrogen responsive elements (ERE's) by
the ER's according to the invention. The receptors according to the invention indeed showed ligand-dependent
transactivation on ERE-containing reporter constructs. Therefore, the classical ER and the novel ER's according to the
invention may have overlapping target gene specificities. This could indicate that in tissues which co-express both
respective ER's, these receptors compete for ERE's. The ER's according to the invention may regulate transcription
of target genes differently from classical ER regulation or could simply block classical ER functioning by occupying
estrogen responsive elements. Alternatively, transcription might be influenced by heterodimerization of the different
receptors.

[0038] Thus, a preferred steroid receptor according to the invention comprises the amino acid sequence E-G-X-X-A
within the P box of the DNA binding domain, wherein X stands for any amino acid. Also within the scope of the
invention is isolated DNA encoding such a receptor.
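As an illustration of the E-G-X-X-A motif, the following Python sketch scans a one-letter amino acid sequence for that pattern. The function name `find_pbox` is hypothetical, and the test fragment is simply a short stretch copied from the sequence listed in paragraph [0013]; the sketch only shows how the motif can be located in text and says nothing about whether a hit lies inside a functional DNA-binding domain.

```python
import re

# E-G-X-X-A, where X is any amino acid (one-letter code, upper case).
PBOX = re.compile(r"EG[A-Z]{2}A")

def find_pbox(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched residues) for every non-overlapping hit."""
    cleaned = sequence.replace(" ", "").upper()
    return [(m.start(), m.group()) for m in PBOX.finditer(cleaned)]

# Fragment taken from the sequence printed in paragraph [0013].
fragment = "YGVWS CEGCK AFFKR"
hits = find_pbox(fragment)
```

For the fragment above the single hit is the pentapeptide EGCKA, starting at index 6 of the space-free string.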

[0039] Methods to prepare the receptors according to the invention are well known in the art (Sambrook et al.,
Molecular Cloning: a Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, 1989). The most
practical approach is to produce these receptors by expression of the DNA encoding the desired protein.

[0040] A wide variety of host cell and cloning vehicle combinations may be usefully employed in cloning the nucleic
acid sequence coding for the receptor of the invention. For example, useful cloning vehicles may include chromosomal,
non-chromosomal and synthetic DNA sequences such as various known bacterial plasmids and wider host range
plasmids and vectors derived from combinations of plasmids and phage or virus DNA. Useful hosts may include
bacterial hosts, yeasts and other fungi, plant or animal hosts, such as Chinese Hamster Ovary (CHO) cells or monkey
cells and other hosts (with exception of human beings).

[0041] Vehicles for use in expression of the ligand-binding domain of the present invention will further comprise
control sequences operably linked to the nucleic acid sequence coding for the ligand-binding domain. Such control
sequences generally comprise a promoter sequence and sequences which regulate and/or enhance expression levels.
Furthermore an origin of replication and/or a dominant selection marker are often present in such vehicles. Of course
control and other sequences can vary depending on the host cell selected.

[0042] Techniques for transforming or transfecting host cells are well known in the art (see, for instance, Sambrook
et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, 1989).

[0043] Recombinant expression vectors comprising the DNA of the invention as well as cells transformed with said 
DNA or said expression vector also form part of the present invention. 

[0044] Since steroid receptors have three domains with different functions, which are more or less independent, it
is possible that all three functional domains have been derived from different members of the steroid receptor
superfamily.

[0045] Molecules which contain parts having a different origin are called chimeric. Such a chimeric receptor
comprising the ligand-binding domain and/or the DNA-binding domain of the invention may be produced by chemical
linkage, but most preferably the coupling is accomplished at the DNA level with standard molecular biological methods
by fusing the nucleic acid sequences encoding the necessary steroid receptor domains. Hence, DNA encoding the
chimeric receptor proteins according to the invention is also subject of the present invention.

[0046] Such chimeric proteins can be prepared by transfecting DNA encoding these chimeric receptor proteins to 
suitable host cells and culturing these cells under suitable conditions. 

[0047] It is extremely practical if, next to the information for the expression of the steroid receptor, the host cell
is also transformed or transfected with a vector which carries the information for a reporter molecule. Such a vector coding
for a reporter molecule is characterized by having a promoter sequence containing one or more hormone responsive
elements (HRE) functionally linked to an operative reporter gene. Such a HRE is the DNA target of the activated steroid
receptor and, as a consequence, it enhances the transcription of the DNA coding for the reporter molecule. In in vivo
settings of steroid receptors the reporter molecule comprises the cellular response to the stimulation of the ligand.
However, it is possible in vitro to combine the ligand-binding domain of a receptor with the DNA-binding domain and
transcription activating domain of other steroid receptors, thereby enabling the use of other HRE and reporter molecule
systems. One such system is established by a HRE present in the MMTV-LTR (mouse mammary tumor virus long
terminal repeat sequence) in connection with a reporter molecule like the firefly luciferase gene or the bacterial gene
for CAT (chloramphenicol acetyltransferase). Other HRE's which can be used are the rat oxytocin promoter, the retinoic acid
responsive element, the thyroid hormone responsive element and the estrogen responsive element; synthetic
responsive elements have also been described (for instance in Fuller, ibid. page 3096). As reporter molecules, next to CAT
and luciferase, β-galactosidase can be used.

[0048] Steroid hormone receptors and chimeric receptors according to the present invention can be used for the in
vitro identification of novel ligands or hormonal analogs. For this purpose binding studies can be performed with cells
transformed with DNA according to the invention or an expression vector comprising DNA according to the invention,
said cells expressing the steroid receptors or chimeric receptors according to the invention.

[0049] The novel steroid hormone receptor and chimeric receptors according to the invention, as well as the
ligand-binding domain of the invention, can be used in an assay for the identification of functional ligands or hormone analogs
for the nuclear receptors.

[0050] Thus, the present invention provides for a method for identifying functional ligands for the steroid receptors 
and chimeric receptors according to the invention, said method comprising the steps of 

a) introducing into a suitable host cell 1) DNA or an expression vector according to the invention, and 2) a suitable
reporter gene functionally linked to an operative hormone response element, said HRE being able to be activated
by the DNA-binding domain of the receptor protein encoded by said DNA;

b) bringing the host cell from step a) into contact with potential ligands which will possibly bind to the ligand-binding
domain of the receptor protein encoded by said DNA from step a);

c) monitoring the expression of the receptor protein encoded by said reporter gene of step a).

[0051] If expression of the reporter gene is induced with respect to basic expression (without ligand), the functional 
ligand can be considered as an agonist; if expression of the reporter gene remains unchanged or is reduced with 
respect to basic expression, the functional ligand can be a suitable (partial) antagonist. 
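The decision rule of paragraph [0051] can be sketched in Python as follows. The text only distinguishes "induced" from "unchanged or reduced"; the tolerance used here to decide when a change counts as induction, as well as the function name and the returned labels, are assumptions added for illustration and are not part of the described method.

```python
def classify_ligand(basal: float, treated: float, tolerance: float = 0.2) -> str:
    """Classify a test compound from reporter-gene readings.

    basal     -- reporter expression without ligand
    treated   -- reporter expression with the test compound
    tolerance -- hypothetical margin: fold changes up to 1 + tolerance
                 are treated as "unchanged"
    """
    if basal <= 0:
        raise ValueError("basal expression must be positive")
    fold = treated / basal
    if fold > 1.0 + tolerance:
        return "agonist"
    # Unchanged or reduced expression: per the text the compound *can be*
    # a (partial) antagonist, which still has to be confirmed separately.
    return "candidate (partial) antagonist"
```

A 3- to 4-fold induction, as reported for the estrogens in Figure 2, would be labelled "agonist" under any small tolerance.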

[0052] For performing such investigations, host cells which have been transformed or transfected with both
a vector encoding a functional steroid receptor and a vector having the information for a hormone responsive element
and a connected reporter molecule are cultured in a suitable medium. After addition of a suitable ligand, which will
activate the receptor, the production of the reporter molecule will be enhanced; this production can simply be
determined by assays having a sensitivity for the reporter molecule. See for instance WO-A-8803168. Assays with known
steroid receptors have been described (for instance S. Tsai et al., Cell 57, 443, 1989; M. Meyer et al., Cell 57, 433, 1989).


Legends to the figures 
Figure 1. 

[0053] Northern analysis of the novel estrogen receptor (ERβ). Two different multiple tissue Northern blots (Clontech)
were hybridised with a specific probe for ERβ (see examples). Indicated are the human tissues the RNA originated
from and the position of the size markers in kilobases (kb).

Figure 2. 

[0054] Histogram showing the 3- to 4-fold stimulatory effect of 17β-estradiol, estriol and estrone on the luciferase
activity mediated by ERβ. An expression vector encoding ERβ was transiently transfected into CHO cells together with
a reporter construct containing the rat oxytocin promoter in front of the firefly luciferase encoding sequence (see
examples).

Figure 3. 

[0055] Effect of 17β-estradiol (E2) alone or in combination with the anti-estrogen ICI-164384 (ICI) on ERα and ERβ.
Expression constructs for ERα (the classical ER) and ERβ were transiently transfected into CHO cells together with
the rat oxytocin promoter-luciferase reporter construct described in the examples. Luciferase activities were determined
in triplicate and normalised for transfection efficiency by measuring β-galactosidase in the same lysate.
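The normalisation mentioned in this legend can be sketched as below. The legend does not say whether the ratio was formed per replicate or on the averaged readings; this sketch assumes a per-replicate ratio followed by averaging, and the function name `normalised_mean` is hypothetical.

```python
from statistics import mean

def normalised_mean(luciferase, beta_gal):
    """Mean of per-replicate luciferase / beta-galactosidase ratios.

    Both arguments are equally long sequences of readings taken from the
    same lysates (for example, three values for a triplicate).
    """
    if len(luciferase) != len(beta_gal) or not luciferase:
        raise ValueError("need equally sized, non-empty replicate lists")
    if any(b <= 0 for b in beta_gal):
        raise ValueError("beta-galactosidase readings must be positive")
    return mean(l / b for l, b in zip(luciferase, beta_gal))
```

Dividing by the β-galactosidase signal corrects for wells that took up more or less DNA, so that differences between treatments reflect receptor activity and not transfection efficiency.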

Figure 4.

[0056] Expression of ERα and ERβ in a number of cell lines determined by RT-PCR analysis (see examples). The
cell lines used were derived from different tissues/cell types: endometrium (ECC1, Ishikawa, HEC-1A, RL95-2);
osteosarcoma (SAOS-2, U2-OS, HOS, MG63); breast tumours (MCF-7, T47D); endothelium (HUV-EC-C, BAEC-1); smooth
muscle (HISM, PAC-1, A7R5, A10, RASMC, CavaSMC); liver (HepG2); colon (CaCo2); and vagina (Hs-760T, SW-954).

[0057] All cell lines were human except for PAC-1, A7R5, A10 and RASMC which are of rat origin, BAEC-1 which is
of bovine origin and CavaSMC which is of guinea pig origin.

Figure 5. 

[0058] Transactivation assay using stably transfected CHO cell lines expressing ERα or ERβ together with the rat
oxytocin-luciferase estrogen-responsive reporter (see examples for details). Hormone-dependent transactivation
curves were determined for 17β-estradiol and for Org4094. For the ER antagonist raloxifen, cells were treated with
2 × 10⁻¹⁰ mol/L 17β-estradiol together with increasing concentrations of raloxifen. Maximal values of the responses were
arbitrarily set at 100%.

Examples 

A. Molecular cloning of the novel estrogen receptor. 

[0059] Two degenerate oligonucleotides containing inosines (I) were based on conserved regions of the DNA-binding 
domains and the ligand-binding domains of the human steroid hormone receptors. 

[0060] Primer #1: 

5'-GGIGA(C/T)GA(A/G)GC(A/T)TCIGGITG(C/T)CA(C/T)TA(C/T)GG-3'

(SEQ ID NO:7).

[0061] Primer #2: 

5'-AAGCCTGG(C/G)A(C/T)IC(G/T)(C/T)TTIGCCCAI(C/T)TIAT-3'

(SEQ ID NO:8).
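To make the degenerate notation concrete, the following Python sketch parses the two primer strings given above and counts how many distinct sequences each mixture contains. Inosine (I) is treated as a single fixed position, since it is one physical base that pairs with several partners; the helper names `parse_primer` and `degeneracy` are hypothetical and not taken from the text.

```python
import re
from math import prod

PRIMER_1 = "GGIGA(C/T)GA(A/G)GC(A/T)TCIGGITG(C/T)CA(C/T)TA(C/T)GG"
PRIMER_2 = "AAGCCTGG(C/G)A(C/T)IC(G/T)(C/T)TTIGCCCAI(C/T)TIAT"

# Either a parenthesised choice such as (C/T) or a single base (I = inosine).
TOKEN = re.compile(r"\(([ACGT](?:/[ACGT])+)\)|([ACGTI])")

def parse_primer(notation: str) -> list[tuple[str, ...]]:
    """Return, for each primer position, the tuple of bases allowed there."""
    cleaned = notation.replace(" ", "")
    positions = []
    pos = 0
    while pos < len(cleaned):
        m = TOKEN.match(cleaned, pos)
        if m is None:
            raise ValueError(f"unexpected character at index {pos}")
        if m.group(1):
            positions.append(tuple(m.group(1).split("/")))
        else:
            positions.append((m.group(2),))
        pos = m.end()
    return positions

def degeneracy(positions: list[tuple[str, ...]]) -> int:
    """Number of distinct oligonucleotides in the degenerate mixture."""
    return prod(len(choices) for choices in positions)
```

Under these assumptions both primers are 29 positions long; primer #1 has six two-fold choices (64 species) and primer #2 has five (32 species).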

[0062] As template, cDNA from human EBV-stimulated PBLs (peripheral blood leukocytes) was used. One microgram
of total RNA was reverse transcribed in a 20 µl reaction containing 50 mM KCl, 10 mM Tris-HCl pH 8.3, 4 mM
MgCl2, 1 mM dNTPs (Pharmacia), 100 pmol random hexanucleotides (Pharmacia), 30 Units RNAse inhibitor (Pharmacia)
and 200 Units M-MLV Reverse transcriptase (Gibco BRL). Reaction mixtures were incubated at 37°C for 30
minutes and heat-inactivated at 100°C for 5 minutes. The cDNA obtained was used in a 100 µl PCR reaction containing
10 mM Tris-HCl pH 8.3, 50 mM KCl, 1.5 mM MgCl2, 0.001% gelatin (w/v), 3% DMSO, 1 microgram of primer #1 and
primer #2 and 2.5 Units of Amplitaq DNA polymerase (Perkin Elmer). PCR reactions were performed in the Perkin
Elmer 9600 thermal cycler. The initial denaturation (4 minutes at 94°C) was followed by 35 cycles with the following
conditions: 30 sec. 94°C, 30 sec. 45°C, 1 minute 72°C; after a final 7 minutes at 72°C the reactions were stored at 4°C.
Aliquots of these reactions were analysed on a 1.5% agarose gel. Fragments of interest were cut out of the gel,
reamplified using identical PCR-conditions and purified using Qiaex II (Qiagen). Fragments were cloned in the pCRII vector
and transformed into bacteria using the TA-cloning kit (Invitrogen). Plasmid DNA was isolated for nucleotide sequence
analysis using the Qiagen plasmid midi protocol (Qiagen). Nucleotide sequence analysis was performed with the ALF
automatic sequencer (Pharmacia) using a T7 DNA sequencing kit (Pharmacia) with vector-specific or fragment-specific
primers.
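The thermal profile described in this paragraph can be written down as a small data structure, which also makes it easy to see how long the programmed hold times add up to. This is only an illustrative sketch: the class and variable names are hypothetical, and ramp times between temperatures, which depend on the instrument, are not included.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    temp_c: float   # block temperature in degrees Celsius
    seconds: int    # programmed hold time

INITIAL_DENATURATION = Step(94, 4 * 60)
CYCLE = (Step(94, 30), Step(45, 30), Step(72, 60))  # denature, anneal, extend
N_CYCLES = 35
FINAL_EXTENSION = Step(72, 7 * 60)
STORAGE_TEMP_C = 4  # reactions are held at this temperature afterwards

def programmed_seconds(initial, cycle, n_cycles, final) -> int:
    """Sum of programmed hold times, excluding instrument ramp times."""
    return initial.seconds + n_cycles * sum(s.seconds for s in cycle) + final.seconds

total = programmed_seconds(INITIAL_DENATURATION, CYCLE, N_CYCLES, FINAL_EXTENSION)
```

The programmed holds sum to 4860 seconds (81 minutes). The low annealing temperature of 45°C is consistent with the use of degenerate, inosine-containing primers.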

[0063] One cloned fragment corresponded to a novel estrogen receptor (ER) which is closely related to the classical
estrogen receptor. Part of the cloned novel estrogen receptor fragment (nucleotides 466 to 797 in SEQ ID NO:1) was
amplified by PCR using oligonucleotide #3 TGTTACGAAGTGGGAATGGTGA (SEQ ID NO:9) and oligonucleotide #2
and used as a probe to screen a human testis cDNA library in λgt11 (Clontech #HL1010b). Recombinant phages were
plated (using Y1090 bacteria grown in LB medium supplemented with 0.2% maltose) at a density of 40,000 pfu
(plaque-forming units) per 135 mm dish and replica filters (Hybond-N, Amersham) were made as described by the supplier.
Filters were prehybridised in a solution containing 0.5 M phosphate buffer (pH 7.5) and 7% SDS at 65°C for at least
30 minutes. DNA probes were purified with Qiaex II (Qiagen), 32P-labeled with a Decaprime kit (Ambion) and added
to the prehybridisation solution. Filters were hybridised at 65°C overnight and then washed in 0.5 X SSC/0.1% SDS
at 65°C. Two positive plaques were identified and could be shown to be identical. These clones were purified by
rescreening one more time. A PCR reaction on the phage eluates with the λgt11-specific primers #4:
5'-TTGACACCAGACCAACTGGTAATG-3' (SEQ ID NO:10) and #5: 5'-GGTGGCGACGACTCCTGGAGCCCG-3' (SEQ ID NO:11)
yielded a fragment of 1700 basepairs on both clones. Subsequent PCR reactions using combinations of a gene-specific
primer #6: 5'-GTACACTGATTTGTAGCTGGAC-3' (SEQ ID NO:12) with the λgt11 primer #4 and gene-specific primer
#7: 5'-CCATGATGATGTCCCTGACC-3' (SEQ ID NO:13) with λgt11 primer #5 yielded fragments of approximately
450 bp and 1000 bp, respectively, which were cloned in the pCRII vector and used for nucleotide sequence
analysis. The conditions for these PCR reactions were as described above except for the primer concentrations (200
ng of each primer) and the annealing temperature (60°C). Since in the cDNA clone the homology with the ER is lost
abruptly at a site which corresponds to the exon 7/exon 8 boundary in the ER (between nucleotides 1247 and 1248 in
SEQ ID NO:1), it was suggested that this sequence corresponds to intron 7 of the novel ER gene. For verification of
the nucleotide sequences of this cDNA clone, a 1200 bp fragment was generated on the cDNA clone with λgt11 primer
#4 and a gene-specific primer #8 corresponding to the 3' end of exon 7: 5'-TCGCATGCCTGACGTGGGAC-3' (SEQ
ID NO:14) using the proofreading Pfu polymerase (Stratagene). This fragment was also cloned in the pCRII vector and
completely sequenced and was shown to be identical to the sequences obtained earlier.

[0064] To obtain nucleotide sequences of the novel ER downstream of exon 7, a degenerate oligonucleotide based
on the AF-2 region of the classical ER (#9: 5'-GGC(C/G)TCCAGCATCTCCAG(C/G)A(A/G)CAG-3'; SEQ ID NO:15)
was used together with the gene-specific oligonucleotide #10: 5'-GGAAGCTGGCTCACTTGCTG-3' (SEQ ID NO:16)
using testis cDNA as template (Marathon ready testis cDNA, Clontech Cat #7414-1). A specific 220 bp fragment
corresponding to nucleotides 1112 to 1332 in SEQ ID NO:1 was cloned and sequenced. Nucleotides 1112 to 1247 were
identical to the corresponding sequence of the cDNA clone. The sequence downstream thereof is highly homologous
with the corresponding region in the classical ER. In order to obtain sequences of the novel ER downstream of the
AF-2 region, RACE (rapid amplification of cDNA ends) PCR reactions were performed using the Marathon-ready testis
cDNA (Clontech) as template. The initial PCR was performed using oligonucleotide #11:
5'-TCTTGTTCTGGACAGGGATG-3' (SEQ ID NO:17) in combination with the AP1 primer provided in the kit. A nested PCR was
performed on an aliquot of this reaction using oligonucleotide #10 (SEQ ID NO:16) in combination with the oligo dT
primer provided in the kit. Subsequently, an aliquot of this reaction was used in a nested PCR using oligonucleotide #12:
5'-GCATGGAACATCTGCTCAAC-3' (SEQ ID NO:18) in combination with the oligo dT primer. Nucleotide sequence analysis of a
specific fragment that was obtained (corresponding to nucleotides 1256 to 1431 in SEQ ID NO:1) revealed a sequence
encoding the carboxyterminus of the novel ER ligand-binding domain, including an F-domain and a translational stop
codon and part of the 3' untranslated sequence which is not included in SEQ ID NO:1. The deduced amino acid
sequence is shown in SEQ ID NO:5.
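
How an amino acid sequence is deduced from the cDNA can be illustrated with a short translation routine. The sketch below assumes the standard genetic code and is not the software used by the inventors. As input it takes the first 60 nucleotides of SEQ ID NO:1 and the last 12 nucleotides of the splice variant in SEQ ID NO:2, both copied from the sequence listing.

```python
# Standard genetic code, indexed with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str, to_stop: bool = True) -> str:
    """Translate a coding sequence in frame 1 into one-letter amino acids."""
    dna = dna.replace(" ", "").upper()
    protein = []
    for pos in range(0, len(dna) - len(dna) % 3, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*" and to_stop:
            break  # stop codon reached
        protein.append(residue)
    return "".join(protein)


# Nucleotides 1 to 60 of SEQ ID NO:1.
SEQ1_START = "ATGAATTACA GCATTCCCAG CAATGTCACT AACTTGGAAG GTGGGCCTGG TCGGCAGACC"
# Nucleotides 1240 to 1251 of SEQ ID NO:2 (end of the exon 8B variant).
SEQ2_END = "CATGCGAGGTGA"

if __name__ == "__main__":
    print(translate(SEQ1_START))
    print(translate(SEQ2_END))
```

The first call reproduces residues 1 to 20 of SEQ ID NO:5 in one-letter code; the second ends at the TGA stop codon immediately after His-Ala-Arg.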

[0065] In order to investigate the possibility that the novel estrogen receptor had additional, upstream
translation-initiation codons, RACE-PCR experiments were performed using Marathon-ready testis cDNA (Clontech Cat. #
7414-1). First a PCR was performed using oligonucleotide SEQ ID NO:12 (antisense corresponding to nucleotides
416-395 in SEQ ID NO:1) and AP-1 (provided in the kit). A nested PCR was then performed using oligonucleotide
having SEQ ID NO:27 (antisense corresponding to nucleotides 254-231 in SEQ ID NO:1) with AP-2 (provided in the
kit). From the smear that was obtained, the region corresponding to fragments larger than 300 basepairs was cut out,
purified using the Geneclean II kit (Bio101) and cloned using the TA-cloning kit (Clontech). Colonies were screened by
PCR using gene-specific primers: SEQ ID NO:22 and SEQ ID NO:28. The clone containing the largest insert was
sequenced. The nucleotide sequence corresponds to nucleotides 1 to 490 in SEQ ID NO:24. It is clear from this
sequence that the first in-frame upstream translation initiation codon is present at position 77-79 in SEQ ID NO:24.
Upstream of this translational start codon an in-frame stop codon is present (11-13 in SEQ ID NO:24). Consequently,
the reading frame of the novel estrogen receptor is 530 amino acids (shown in SEQ ID NO:25) and has a calculated
molecular mass of 59.234 kD.
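
The statement that SEQ ID NO:12 is antisense to nucleotides 416-395 of SEQ ID NO:1 can be checked by reverse-complementing the primer and locating it in the sense strand. The following is a minimal sketch under the assumption that the row of SEQ ID NO:1 covering nucleotides 361 to 420 is transcribed correctly from the sequence listing; the helper names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def locate_antisense(primer: str, sense: str, first_position: int):
    """Return the (start, end) positions (1-based, inclusive) on the sense
    strand that the antisense primer anneals to, or None if absent."""
    target = reverse_complement(primer)
    offset = sense.find(target)
    if offset < 0:
        return None
    start = first_position + offset
    return start, start + len(primer) - 1


# Nucleotides 361 to 420 of SEQ ID NO:1 (one row of the sequence listing).
SEQ1_361_420 = "AAAAGAAGCATTCAAGGACATAATGATTATATTTGTCCAGCTACAAATCAGTGTACAATC"
PRIMER_6 = "GTACACTGATTTGTAGCTGGAC"  # SEQ ID NO:12

if __name__ == "__main__":
    print(locate_antisense(PRIMER_6, SEQ1_361_420, first_position=361))
```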

[0066] To confirm the nucleotide sequences obtained by 5' RACE, human genomic clones were obtained and
analysed. A human genomic library in λEMBL3 (Clontech HL1067J) was screened with a probe corresponding to
nucleotides 1 to 416 in SEQ ID NO:1. A strongly hybridizing clone was plaque-purified and DNA was isolated using standard
protocols (Sambrook et al, 1989). The DNA was digested with several restriction enzymes, electrophoresed on agarose
gel and blotted onto Nylon filters. Hybridisation of the blot with a probe corresponding to the above-mentioned RACE
fragment (nucleotides 1-490 in SEQ ID NO:24) revealed a hybridizing Sau3A fragment of approximately 800 basepairs.
This fragment was cloned into the BamHI site of pGEM3Z and sequenced. The nucleotide sequence contained one
base difference which is probably a PCR-induced point mutation in the RACE fragment. Nucleotide 172 was a G residue
in the 5' RACE fragment, but an A residue in several independent genomic subclones.

B. Identification of two splice variants of the novel estrogen receptor. 

[0067] Rescreening of the testis cDNA library with a probe corresponding to nucleotides 918 to 1246 in SEQ ID NO:1
yielded two hybridizing clones, the 3' ends of which were amplified by PCR (gene-specific primer #10:
5'-GGAAGCTGGCTCACTTGCTG-3' (SEQ ID NO:16) together with primer #4, SEQ ID NO:10), cloned and sequenced. One clone
was shown to contain an alternative exon 8 (exon 8B) of the novel ER. In SEQ ID NO:2 the protein-encoding part and
the stop codon of this splice variant are presented. As a consequence of the introduction of this exon through an
alternative splicing reaction, the reading frame encoding the novel ER is immediately terminated, thereby creating a
truncation of the carboxyterminus of the novel ER (SEQ ID NO:6).
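
The point at which the exon 8B variant departs from the full-length sequence can be located by comparing the two sequences base by base. The sketch below is illustrative only; it compares the final row of SEQ ID NO:2 (nucleotides 1201 to 1251) with the corresponding row of SEQ ID NO:1, both taken from the sequence listing, and reports the last shared nucleotide.

```python
def shared_prefix_length(a: str, b: str) -> int:
    """Number of leading positions at which the two sequences agree."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


ROW_START = 1201  # position of the first nucleotide in each row below
SEQ1_ROW = "CGCCTGGCTAACCTCCTGATGCTCCTGTCCCACGTCAGGCATGCGAGTAACAAGGGCATG"
SEQ2_ROW = "CGCCTGGCTAACCTCCTGATGCTCCTGTCCCACGTCAGGCATGCGAGGTGA"

# Last nucleotide that the full-length cDNA and the splice variant have in common.
LAST_SHARED = ROW_START + shared_prefix_length(SEQ1_ROW, SEQ2_ROW) - 1

if __name__ == "__main__":
    print(f"sequences diverge after nucleotide {LAST_SHARED}")
```

The result agrees with the exon 7/exon 8 boundary placed between nucleotides 1247 and 1248 of SEQ ID NO:1.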

[0068] Screening of a human thymus cDNA library (Clontech HL1074a) with the probe corresponding to nucleotides
918 to 1246 in SEQ ID NO:1 revealed another splice variant. The 3' end of one hybridizing clone was amplified using
primer #10 (SEQ ID NO:16) with the λgt10-specific primer #13: 5'-AGCAAGTTCAGCCTGTTAAGT-3' (SEQ ID NO:19),
cloned and sequenced. The obtained nucleotide sequence upstream of the exon 7/exon 8 boundary was identical to
the clones identified earlier. However, an alternative exon 8 (exon 8C) was present at the 3' end encoding two
C-terminal amino acids followed by a stop codon. The nucleotide sequence of the protein-encoding part of this splice
variant is shown in SEQ ID NO:20, the corresponding protein sequence is SEQ ID NO:21.

[0069] These two variants of the novel estrogen receptor do not contain the AF-2 region and therefore probably lack
the ability to modulate transcription of target genes in a ligand-dependent fashion. However, the variants potentially
could interfere with the functioning of the wild-type classical ER and/or the wild-type novel ER, either by
heterodimerization or by occupying estrogen response elements or by interactions with other transcription factors. A mutant
of the classical ER (ER1-530) has been described which closely resembles the two variants of the novel estrogen receptor
described above. ER1-530 has been shown to behave as a dominant-negative receptor, i.e. it can modulate the
intracellular activity of the wild-type ER (Ince et al, J. Biol. Chem. 268, 14026-14032, 1993).

C. Northern blot analysis. 

[0070] Human multiple tissue Northern blots (MTN-blots) were purchased from Clontech and prehybridized for at
least 1 hour at 65°C in 0.5 M phosphate buffer pH 7.5 with 7% SDS. The DNA fragment that was used as a probe
(corresponding to nucleotides 466 to 797 in SEQ ID NO:1) was 32P-labeled using a labelling kit (Ambion), denatured
by boiling and added to the prehybridisation solution. Washing conditions were: 3 X SSC at room temperature, followed
by 3 X SSC at 65°C, and finally 1 X SSC at 65°C. The filters were then exposed to X-ray films for one week. Two
transcripts of approximately 8 kb and 10 kb were detected in thymus, spleen, ovary and testis. In addition, a 1.3 kb
transcript was detected in testis.

D. RT-PCR analysis of expression of ERα and ERβ in cell lines.

[0071] RNA was isolated from a number of human and animal cell lines using RNAzol B (Cinna/Biotecx). cDNA was
made using 2.5 microgram of total RNA using the Superscript II kit (BRL) following the manufacturer's instructions. A
portion of the cDNA was used for specific PCR amplifications of fragments corresponding either to mRNA encoding
the ER or to the novel estrogen receptor. (It should be emphasized that the primers used are based on human and rat
sequences, whereas some of the cell lines were not rat or human, see legend of Figure 4). Primers used were for ERα:
sense 5'-GATGGGCTTACTGACCAACC-3' and antisense 5'-AGATGCTCCATGCCTTTG-3' generating a 548 base pair
fragment corresponding to part of the LBD. For ERβ: sense 5'-TTCACCGAGGCCTCCATGATG-3' and antisense
5'-CAGATGTTCCATGCCCTTGTT-3' generating a 565 base pair fragment corresponding to part of the LBD. The PCR
samples were analysed on agarose gels, which were blotted onto Nylon membranes. These blots were hybridised with
32P-labeled PCR fragments, generated with the above-mentioned primers on ERα and ERβ plasmid DNA using standard
experimental procedures (Sambrook et al, 1989).
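
Basic properties of the four RT-PCR primers, such as length and GC content, can be tabulated with a few lines of code. The sketch below also applies the Wallace rule, Tm = 2(A+T) + 4(G+C), which is a rough rule of thumb for short oligonucleotides. It is not mentioned in this specification and is included purely as an assumption for illustration; the dictionary keys are hypothetical labels.

```python
PRIMERS = {
    "ERalpha sense": "GATGGGCTTACTGACCAACC",
    "ERalpha antisense": "AGATGCTCCATGCCTTTG",
    "ERbeta sense": "TTCACCGAGGCCTCCATGATG",
    "ERbeta antisense": "CAGATGTTCCATGCCCTTGTT",
}


def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in deg C by the Wallace rule (assumption)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, Tm ~{wallace_tm(seq)} C")
```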


E. Ligand-dependent transcription activation by the novel estrogen receptor protein. 

Cell culture 

[0072] Chinese Hamster Ovary (CHO K1) cells were obtained from ATCC (CCL61) and maintained at 37°C in a
humidified atmosphere (5% CO2) as a monolayer culture in phenol red-free M505 medium. The latter medium consists
of a mixture (1:1) of Dulbecco's Modified Eagle's Medium (DMEM, Gibco 074-200) and Nutrient Medium F12 (Ham's
F12, Gibco 074-1700) supplemented with 2.5 mg/ml sodium carbonate (Baker), 55 µg/ml sodium pyruvate (Fluka), 2.3
µg/ml β-mercaptoethanol (Baker), 1.2 µg/ml ethanolamine (Baker), 360 µg/ml L-glutamine (Merck), 0.45 µg/ml sodium
selenite (Fluka), 62.5 µg/ml penicillin (Mycopharm), 62.5 µg/ml streptomycin (Serva), and 5% charcoal-treated bovine
calf serum (Hyclone).

Recombinant vectors

[0073] The ERβ-encoding sequence as presented in SEQ ID NO:1 was amplified by PCR using oligonucleotides
5'-CTTGGATCCATAGCCCTGCTGTGATGAATTACAG-3' (SEQ ID NO:22; underlined is the translation initiation codon)
in combination with 5'-GATGGATCCTCACCTCAGGGCCAGGCGTCACTG-3' (SEQ ID NO:23) (underlined is the
translation stop codon, antisense). The resulting BamHI fragment (approximately 1450 base pairs) was then cloned in the
mammalian cell expression vector pNGV1 (Genbank accession No. X99274).

[0074] An expression construct encoding the ERβ reading frame as presented in SEQ ID NO:24 was made by
replacing a BamHI-MscI fragment (nucleotides 1-81 in SEQ ID NO:1) by a BamHI-MscI fragment corresponding to
nucleotides 77-316 in SEQ ID NO:24. The latter fragment was made by PCR with SEQ ID NO:26 in combination with
SEQ ID NO:28 using the above-mentioned 5' RACE fragment.

[0075] The reporter vector was based on the rat oxytocin gene regulatory region (position -363/+16 as a HindIII/
MboI fragment; R. Ivell and D. Richter, Proc. Natl. Acad. Sci. USA 81, 2006-2010, 1984) linked to the firefly luciferase
encoding sequence; the regulatory region of the oxytocin gene was shown to possess functional estrogen hormone
response elements in vitro for both the rat (R. Adan et al, Biochem. Biophys. Res. Comm. 175, 117-122, 1991) and the
human (S. Richard and H. Zingg, J. Biol. Chem. 265, 6098-6103, 1990).

Transient transfection 

[0076] 1 x 10^5 CHO cells were seeded in 6-well Nunclon tissue culture plates and DNA was introduced by use of
lipofectin (Gibco BRL). Hereto, the DNA (1 µg of both receptor and reporter vector in 250 µL Optimem, Gibco BRL)
was mixed with an equal volume of lipofectin reagent (7 µL in 250 µL Optimem, Gibco) and allowed to stand at room
temperature for 15 min. After washing the cells twice with serum-free medium (M505), new medium (500 µL Optimem,
Gibco) was added to the cells followed by the dropwise addition of the DNA-lipofectin mixture. After incubation for a 5
hour period at 37°C, cells were washed twice with phenol red-free M505 + 5% charcoal-treated bovine calf serum and
incubated overnight at 37°C. After 24 hours hormones were added to the medium (10^-7 mol/L). Cell extracts were
made 48 hours posttransfection by the addition of 200 µL lysis buffer (0.1 M phosphate buffer pH 7.8, 0.2% Triton
X-100). After incubation for 5 min at 37°C the cell suspension was centrifuged (Eppendorf centrifuge, 5 min) and 20 µL
sample was added to 50 µL luciferase assay reagent (Promega). Light emission was measured in a luminometer
(Berthold Biolumat) for 10 sec at 562 nm.


Stable transfection of the novel estrogen receptor. 

[0077] The expression plasmid encoding full-length ERβ1-530 (see above) was stably transfected in CHO K1 cells
as previously described (Theunissen et al., J. Biol. Chem. 268, 9035-9040, 1993). Single cell clones that were obtained
this way were screened by transient transfection of the reporter plasmid (rat oxytocin-luciferase) as described above.
Selected clones were used for a second stable transfection of the rat oxytocin-luciferase reporter plasmid together
with the plasmid pDR2A, which contains a hygromycin resistance gene for selection. Single cell clones obtained were
tested for a response to 17β-estradiol. Subsequently, a selected single cell clone was used for transactivation studies.
Briefly, cells were seeded in 96-well plates at 1.6 x 10^4 cells per well. After 24 hours different concentrations of hormone
were diluted in medium and added to the wells. For antagonistic experiments, 2 x 10^-10 M 17β-estradiol was added to
each well and different concentrations of antagonists were added. Cells were washed once with PBS after a 24 hour
incubation and then lysed by the addition of 40 microliter lysis buffer (see above). Luciferase reagent was added (50
microliter) to each well and light emission was measured using the Topcount (Packard).

Results.

[0078] A comparison of the two expression constructs (SEQ ID NO:1 and SEQ ID NO:24) in transient transfections
in CHO cells showed identical transactivation in response to a number of agonists and antagonists. CHO cells
transiently transfected with ERβ expression vector and a reporter plasmid showed a 3 to 4 fold increase in luciferase activity
in response to 17β-estradiol as compared to untreated cells (see Figure 2). A similar transactivation was obtained upon
treatment with estriol and estrone. The results indicate not only that the novel ER (ERβ) can bind estrogen hormones
but also that the ligand-activated receptor can bind to the estrogen-response elements (EREs) within the rat oxytocin
promoter and activate transcription of the luciferase reporter gene. Figure 3 shows that in an independent similar
experiment 10^-9 mol/L 17β-estradiol gave an 18-fold stimulation with ERα and a 7-fold stimulation with ERβ. In addition,
the antiestrogen ICI-164384 was shown to be an antagonist for both ERα and ERβ when activated with 17β-estradiol,
whereas the antagonist alone had no effect. In this experiment 0.25 µg β-galactosidase vector was co-transfected in
order to normalize for differences in transfection efficiency.
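
Normalisation against a co-transfected β-galactosidase vector amounts to dividing each luciferase reading by the β-galactosidase activity of the same extract before the fold stimulation is calculated. The sketch below shows one common way of doing this; the function names and all readings are hypothetical and are not data from this specification.

```python
def normalised_activity(luciferase: float, beta_gal: float) -> float:
    """Luciferase signal corrected for transfection efficiency."""
    if beta_gal <= 0:
        raise ValueError("beta-galactosidase activity must be positive")
    return luciferase / beta_gal


def fold_stimulation(treated, control) -> float:
    """Ratio of normalised activities; each argument is (luciferase, beta_gal)."""
    return normalised_activity(*treated) / normalised_activity(*control)


# Hypothetical readings in arbitrary units, for illustration only.
HORMONE_TREATED = (7000.0, 1.0)
UNTREATED = (1200.0, 1.2)

if __name__ == "__main__":
    print(f"{fold_stimulation(HORMONE_TREATED, UNTREATED):.1f}-fold")
```

Without the correction the same readings would suggest a lower ratio (7000/1200), which illustrates why the normalisation matters when wells differ in transfection efficiency.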

[0079] Transactivation studies performed on stably transfected ERα and ERβ cell lines gave similar absolute
luciferase values. The curves for 17β-estradiol are very similar and show that half-maximal transactivation is reached with
lower concentrations of hormone on ERα as compared to ERβ (Figure 5). For Org4094 this is also the case; however,
the effect observed is much more pronounced. The curves for raloxifene show that the potency of this antagonist to
block transactivation on ERα is greater than its potency to block ERβ transactivation.
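
The concentration giving half-maximal transactivation can be estimated from a dose-response series by interpolating on a logarithmic concentration scale. The following sketch uses simple log-linear interpolation between the two measured points that bracket the half-maximal response. It is one possible approach, not the analysis used for Figure 5, and the dose-response values are hypothetical.

```python
import math


def half_maximal_concentration(concentrations, responses):
    """Estimate the concentration giving a response halfway between the
    lowest and highest measured response.

    Assumes ascending concentrations (mol/L) and an increasing response.
    """
    low, high = min(responses), max(responses)
    half = low + (high - low) / 2
    points = list(zip(concentrations, responses))
    for (c0, r0), (c1, r1) in zip(points, points[1:]):
        if r1 != r0 and r0 <= half <= r1:
            fraction = (half - r0) / (r1 - r0)
            log_c = math.log10(c0) + fraction * (math.log10(c1) - math.log10(c0))
            return 10 ** log_c
    raise ValueError("half-maximal response is not bracketed by the data")


# Hypothetical luciferase counts for a 17beta-estradiol dilution series.
CONCENTRATIONS = [1e-12, 1e-11, 1e-10, 1e-9, 1e-8]
RESPONSES = [100.0, 200.0, 600.0, 1000.0, 1100.0]

if __name__ == "__main__":
    print(f"{half_maximal_concentration(CONCENTRATIONS, RESPONSES):.2e} mol/L")
```

Applying the same function to the curves for two receptors would allow the half-maximal concentrations to be compared directly; a fitted sigmoidal model would be the more rigorous choice when enough data points are available.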


SEQUENCE LISTING 
[0080]

(1) GENERAL INFORMATION:

(i) APPLICANT:

(A) NAME: Akzo Nobel N.V.

(B) STREET: Velperweg 76

(C) CITY: Arnhem

(E) COUNTRY: The Netherlands

(F) POSTAL CODE (ZIP): 6824 BM

(G) TELEPHONE: 0412-666379

(H) TELEFAX: 0412-650592

(I) TELEX: 37503 akpha nl

(ii) TITLE OF INVENTION: Novel estrogen receptor

(iii) NUMBER OF SEQUENCES: 28

(iv) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1434 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ATGAATTACA GCATTCCCAG CAATGTCACT AACTTGGAAG GTGGGCCTGG TCGGCAGACC     60
ACAAGCCCAA ATGTGTTGTG GCCAACACCT GGGCACCTTT CTCCTTTAGT GGTCCATCGC    120
CAGTTATCAC ATCTGTATGC GGAACCTCAA AAGAGTCCCT GGTGTGAAGC AAGATCGCTA    180
GAACACACCT TACCTGTAAA CAGAGAGACA CTGAAAAGGA AGGTTAGTGG GAACCGTTGC    240
GCCAGCCCTG TTACTGGTCC AGGTTCAAAG AGGGATGCTC ACTTCTGCGC TGTCTGCAGC    300
GATTACGCAT CGGGATATCA CTATGGAGTC TGGTCGTGTG AAGGATGTAA GGCCTTTTTT    360
AAAAGAAGCA TTCAAGGACA TAATGATTAT ATTTGTCCAG CTACAAATCA GTGTACAATC    420
GATAAAAACC GGCGCAAGAG CTGCCAGGCC TGCCGACTTC GGAAGTGTTA CGAAGTGGGA    480
ATGGTGAAGT GTGGCTCCCG GAGAGAGAGA TGTGGGTACC GCCTTGTGCG GAGACAGAGA    540
AGTGCCGACG AGCAGCTGCA CTGTGCCGGC AAGGCCAAGA GAAGTGGCGG CCACGCGCCC    600
CGAGTGCGGG AGCTGCTGCT GGACGCCCTG AGCCCCGAGC AGCTAGTGCT CACCCTCCTG    660
GAGGCTGAGC CGCCCCATGT GCTGATCAGC CGCCCCAGTG CGCCCTTCAC CGAGGCCTCC    720
ATGATGATGT CCCTGACCAA GTTGGCCGAC AAGGAGTTGG TACACATGAT CAGCTGGGCC    780
AAGAAGATTC CCGGCTTTGT GGAGCTCAGC CTGTTCGACC AAGTGCGGCT CTTGGAGAGC    840
TGTTGGATGG AGGTGTTAAT GATGGGGCTG ATGTGGCGCT CAATTGACCA CCCCGGCAAG    900
CTCATCTTTG CTCCAGATCT TGTTCTGGAC AGGGATGAGG GGAAATGCGT AGAAGGAATT    960
CTGGAAATCT TTGACATGCT CCTGGCAACT ACTTCAAGGT TTCGAGAGTT AAAACTCCAA   1020
CACAAAGAAT ATCTCTGTGT CAAGGCCATG ATCCTGCTCA ATTCCAGTAT GTACCCTCTG   1080
GTCACAGCGA CCCAGGATGC TGACAGCAGC CGGAAGCTGG CTCACTTGCT GAACGCCGTG   1140
ACCGATGCTT TGGTTTGGGT GATTGCCAAG AGCGGCATCT CCTCCCAGCA GCAATCCATG   1200
CGCCTGGCTA ACCTCCTGAT GCTCCTGTCC CACGTCAGGC ATGCGAGTAA CAAGGGCATG   1260
GAACATCTGC TCAACATGAA GTGCAAAAAT GTGGTCCCAG TGTATGACCT GCTGCTGGAG   1320
ATGCTGAATG CCCACGTGCT TCGCGGGTGC AAGTCCTCCA TCACGGGGTC CGAGTGCAGC   1380
CCGGCAGAGG ACAGTAAAAG CAAAGAGGGC TCCCAGAACC CACAGTCTCA GTGA         1434

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1251 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

ATGAATTACA GCATTCCCAG CAATGTCACT AACTTGGAAG GTGGGCCTGG TCGGCAGACC     60
ACAAGCCCAA ATGTGTTGTG GCCAACACCT GGGCACCTTT CTCCTTTAGT GGTCCATCGC    120
CAGTTATCAC ATCTGTATGC GGAACCTCAA AAGAGTCCCT GGTGTGAAGC AAGATCGCTA    180
GAACACACCT TACCTGTAAA CAGAGAGACA CTGAAAAGGA AGGTTAGTGG GAACCGTTGC    240
GCCAGCCCTG TTACTGGTCC AGGTTCAAAG AGGGATGCTC ACTTCTGCGC TGTCTGCAGC    300
GATTACGCAT CGGGATATCA CTATGGAGTC TGGTCGTGTG AAGGATGTAA GGCCTTTTTT    360
AAAAGAAGCA TTCAAGGACA TAATGATTAT ATTTGTCCAG CTACAAATCA GTGTACAATC    420
GATAAAAACC GGCGCAAGAG CTGCCAGGCC TGCCGACTTC GGAAGTGTTA CGAAGTGGGA    480
ATGGTGAAGT GTGGCTCCCG GAGAGAGAGA TGTGGGTACC GCCTTGTGCG GAGACAGAGA    540
AGTGCCGACG AGCAGCTGCA CTGTGCCGGC AAGGCCAAGA GAAGTGGCGG CCACGCGCCC    600
CGAGTGCGGG AGCTGCTGCT GGACGCCCTG AGCCCCGAGC AGCTAGTGCT CACCCTCCTG    660
GAGGCTGAGC CGCCCCATGT GCTGATCAGC CGCCCCAGTG CGCCCTTCAC CGAGGCCTCC    720
ATGATGATGT CCCTGACCAA GTTGGCCGAC AAGGAGTTGG TACACATGAT CAGCTGGGCC    780
AAGAAGATTC CCGGCTTTGT GGAGCTCAGC CTGTTCGACC AAGTGCGGCT CTTGGAGAGC    840
TGTTGGATGG AGGTGTTAAT GATGGGGCTG ATGTGGCGCT CAATTGACCA CCCCGGCAAG    900
CTCATCTTTG CTCCAGATCT TGTTCTGGAC AGGGATGAGG GGAAATGCGT AGAAGGAATT    960
CTGGAAATCT TTGACATGCT CCTGGCAACT ACTTCAAGGT TTCGAGAGTT AAAACTCCAA   1020
CACAAAGAAT ATCTCTGTGT CAAGGCCATG ATCCTGCTCA ATTCCAGTAT GTACCCTCTG   1080
GTCACAGCGA CCCAGGATGC TGACAGCAGC CGGAAGCTGG CTCACTTGCT GAACGCCGTG   1140
ACCGATGCTT TGGTTTGGGT GATTGCCAAG AGCGGCATCT CCTCCCAGCA GCAATCCATG   1200
CGCCTGGCTA ACCTCCTGAT GCTCCTGTCC CACGTCAGGC ATGCGAGGTG A            1251


(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 66 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Cys Ala Val Cys Ser Asp Tyr Ala Ser Gly Tyr His Tyr Gly Val Trp
1               5                   10                  15

Ser Cys Glu Gly Cys Lys Ala Phe Phe Lys Arg Ser Ile Gln Gly His
            20                  25                  30

Asn Asp Tyr Ile Cys Pro Ala Thr Asn Gln Cys Thr Ile Asp Lys Asn
        35                  40                  45

Arg Arg Lys Ser Cys Gln Ala Cys Arg Leu Arg Lys Cys Tyr Glu Val
    50                  55                  60

Gly Met
65




(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 233 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Leu Val Leu Thr Leu Leu Glu Ala Glu Pro Pro His Val Leu Ile Ser
1               5                   10                  15

Arg Pro Ser Ala Pro Phe Thr Glu Ala Ser Met Met Met Ser Leu Thr
            20                  25                  30

Lys Leu Ala Asp Lys Glu Leu Val His Met Ile Ser Trp Ala Lys Lys
        35                  40                  45

Ile Pro Gly Phe Val Glu Leu Ser Leu Phe Asp Gln Val Arg Leu Leu
    50                  55                  60

Glu Ser Cys Trp Met Glu Val Leu Met Met Gly Leu Met Trp Arg Ser
65                  70                  75                  80

Ile Asp His Pro Gly Lys Leu Ile Phe Ala Pro Asp Leu Val Leu Asp
                85                  90                  95

Arg Asp Glu Gly Lys Cys Val Glu Gly Ile Leu Glu Ile Phe Asp Met
            100                 105                 110

Leu Leu Ala Thr Thr Ser Arg Phe Arg Glu Leu Lys Leu Gln His Lys
        115                 120                 125

Glu Tyr Leu Cys Val Lys Ala Met Ile Leu Leu Asn Ser Ser Met Tyr
    130                 135                 140

Pro Leu Val Thr Ala Thr Gln Asp Ala Asp Ser Ser Arg Lys Leu Ala
145                 150                 155                 160

His Leu Leu Asn Ala Val Thr Asp Ala Leu Val Trp Val Ile Ala Lys
                165                 170                 175

Ser Gly Ile Ser Ser Gln Gln Gln Ser Met Arg Leu Ala Asn Leu Leu
            180                 185                 190

Met Leu Leu Ser His Val Arg His Ala Ser Asn Lys Gly Met Glu His
        195                 200                 205

Leu Leu Asn Met Lys Cys Lys Asn Val Val Pro Val Tyr Asp Leu Leu
    210                 215                 220

Leu Glu Met Leu Asn Ala His Val Leu
225                 230

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 477 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Met Asn Tyr Ser Ile Pro Ser Asn Val Thr Asn Leu Glu Gly Gly Pro
1               5                   10                  15

Gly Arg Gln Thr Thr Ser Pro Asn Val Leu Trp Pro Thr Pro Gly His
            20                  25                  30

Leu Ser Pro Leu Val Val His Arg Gln Leu Ser His Leu Tyr Ala Glu
        35                  40                  45

Pro Gln Lys Ser Pro Trp Cys Glu Ala Arg Ser Leu Glu His Thr Leu
    50                  55                  60

Pro Val Asn Arg Glu Thr Leu Lys Arg Lys Val Ser Gly Asn Arg Cys
65                  70                  75                  80

Ala Ser Pro Val Thr Gly Pro Gly Ser Lys Arg Asp Ala His Phe Cys
                85                  90                  95

Ala Val Cys Ser Asp Tyr Ala Ser Gly Tyr His Tyr Gly Val Trp Ser
            100                 105                 110

Cys Glu Gly Cys Lys Ala Phe Phe Lys Arg Ser Ile Gln Gly His Asn
        115                 120                 125

Asp Tyr Ile Cys Pro Ala Thr Asn Gln Cys Thr Ile Asp Lys Asn Arg
    130                 135                 140

Arg Lys Ser Cys Gln Ala Cys Arg Leu Arg Lys Cys Tyr Glu Val Gly
145                 150                 155                 160

Met Val Lys Cys Gly Ser Arg Arg Glu Arg Cys Gly Tyr Arg Leu Val
                165                 170                 175

Arg Arg Gln Arg Ser Ala Asp Glu Gln Leu His Cys Ala Gly Lys Ala
            180                 185                 190

Lys Arg Ser Gly Gly His Ala Pro Arg Val Arg Glu Leu Leu Leu Asp
        195                 200                 205

Ala Leu Ser Pro Glu Gln Leu Val Leu Thr Leu Leu Glu Ala Glu Pro
    210                 215                 220

Pro His Val Leu Ile Ser Arg Pro Ser Ala Pro Phe Thr Glu Ala Ser
225                 230                 235                 240

Met Met Met Ser Leu Thr Lys Leu Ala Asp Lys Glu Leu Val His Met
                245                 250                 255

Ile Ser Trp Ala Lys Lys Ile Pro Gly Phe Val Glu Leu Ser Leu Phe
            260                 265                 270

Asp Gln Val Arg Leu Leu Glu Ser Cys Trp Met Glu Val Leu Met Met
        275                 280                 285

Gly Leu Met Trp Arg Ser Ile Asp His Pro Gly Lys Leu Ile Phe Ala
    290                 295                 300

Pro Asp Leu Val Leu Asp Arg Asp Glu Gly Lys Cys Val Glu Gly Ile
305                 310                 315                 320

Leu Glu Ile Phe Asp Met Leu Leu Ala Thr Thr Ser Arg Phe Arg Glu
                325                 330                 335

Leu Lys Leu Gln His Lys Glu Tyr Leu Cys Val Lys Ala Met Ile Leu
            340                 345                 350

Leu Asn Ser Ser Met Tyr Pro Leu Val Thr Ala Thr Gln Asp Ala Asp
        355                 360                 365

Ser Ser Arg Lys Leu Ala His Leu Leu Asn Ala Val Thr Asp Ala Leu
    370                 375                 380

Val Trp Val Ile Ala Lys Ser Gly Ile Ser Ser Gln Gln Gln Ser Met
385                 390                 395                 400

Arg Leu Ala Asn Leu Leu Met Leu Leu Ser His Val Arg His Ala Ser
                405                 410                 415

Asn Lys Gly Met Glu His Leu Leu Asn Met Lys Cys Lys Asn Val Val
            420                 425                 430

Pro Val Tyr Asp Leu Leu Leu Glu Met Leu Asn Ala His Val Leu Arg
        435                 440                 445

Gly Cys Lys Ser Ser Ile Thr Gly Ser Glu Cys Ser Pro Ala Glu Asp
    450                 455                 460

Ser Lys Ser Lys Glu Gly Ser Gln Asn Pro Gln Ser Gln
465                 470                 475


(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 416 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Asn Tyr Ser Ile Pro Ser Asn Val Thr Asn Leu Glu Gly Gly Pro
1               5                   10                  15

Gly Arg Gln Thr Thr Ser Pro Asn Val Leu Trp Pro Thr Pro Gly His
            20                  25                  30

Leu Ser Pro Leu Val Val His Arg Gln Leu Ser His Leu Tyr Ala Glu
        35                  40                  45

Pro Gln Lys Ser Pro Trp Cys Glu Ala Arg Ser Leu Glu His Thr Leu
    50                  55                  60

Pro Val Asn Arg Glu Thr Leu Lys Arg Lys Val Ser Gly Asn Arg Cys
65                  70                  75                  80

Ala Ser Pro Val Thr Gly Pro Gly Ser Lys Arg Asp Ala His Phe Cys
                85                  90                  95

Ala Val Cys Ser Asp Tyr Ala Ser Gly Tyr His Tyr Gly Val Trp Ser
            100                 105                 110

Cys Glu Gly Cys Lys Ala Phe Phe Lys Arg Ser Ile Gln Gly His Asn
        115                 120                 125

Asp Tyr Ile Cys Pro Ala Thr Asn Gln Cys Thr Ile Asp Lys Asn Arg
    130                 135                 140

Arg Lys Ser Cys Gln Ala Cys Arg Leu Arg Lys Cys Tyr Glu Val Gly
145                 150                 155                 160

Met Val Lys Cys Gly Ser Arg Arg Glu Arg Cys Gly Tyr Arg Leu Val
                165                 170                 175

Arg Arg Gln Arg Ser Ala Asp Glu Gln Leu His Cys Ala Gly Lys Ala
            180                 185                 190

Lys Arg Ser Gly Gly His Ala Pro Arg Val Arg Glu Leu Leu Leu Asp
        195                 200                 205

Ala Leu Ser Pro Glu Gln Leu Val Leu Thr Leu Leu Glu Ala Glu Pro
    210                 215                 220

Pro His Val Leu Ile Ser Arg Pro Ser Ala Pro Phe Thr Glu Ala Ser
225                 230                 235                 240

Met Met Met Ser Leu Thr Lys Leu Ala Asp Lys Glu Leu Val His Met
                245                 250                 255

Ile Ser Trp Ala Lys Lys Ile Pro Gly Phe Val Glu Leu Ser Leu Phe
            260                 265                 270

Asp Gln Val Arg Leu Leu Glu Ser Cys Trp Met Glu Val Leu Met Met
        275                 280                 285

Gly Leu Met Trp Arg Ser Ile Asp His Pro Gly Lys Leu Ile Phe Ala
    290                 295                 300

Pro Asp Leu Val Leu Asp Arg Asp Glu Gly Lys Cys Val Glu Gly Ile
305                 310                 315                 320

Leu Glu Ile Phe Asp Met Leu Leu Ala Thr Thr Ser Arg Phe Arg Glu
                325                 330                 335

Leu Lys Leu Gln His Lys Glu Tyr Leu Cys Val Lys Ala Met Ile Leu
            340                 345                 350

Leu Asn Ser Ser Met Tyr Pro Leu Val Thr Ala Thr Gln Asp Ala Asp
        355                 360                 365

Ser Ser Arg Lys Leu Ala His Leu Leu Asn Ala Val Thr Asp Ala Leu
    370                 375                 380

Val Trp Val Ile Ala Lys Ser Gly Ile Ser Ser Gln Gln Gln Ser Met
385                 390                 395                 400

Arg Leu Ala Asn Leu Leu Met Leu Leu Ser His Val Arg His Ala Arg
                405                 410                 415


(2) INFORMATION FOR SEQ ID NO: 7: 







(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: both 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 



GGIGAYGARG CWTCIGGITG YCAYTAYGG



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 



AAGCCTGGSA YICKYTTIGC CCAIYTIAT 



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 



TGTTACGAAG TGGGAATGGT GA 

(2) INFORMATION FOR SEQ ID NO: 10: 
(i) SEQUENCE CHARACTERISTICS: 



(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 



TTGACACCAG ACCAACTGGT AATG 



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 



GGTGGCGACG ACTCCTGGAG CCCG 






(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 



GTACACTGAT TTGTAGCTGG AC 



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 







CCATGATGAT GTCCCTGACC 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

TCGCATGCCT GACGTGGGAC 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

GGCSTCCAGC ATCTCCAGSA RCAG 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

GGAAGCTGGC TCACTTGCTG 



(2) INFORMATION FOR SEQ ID NO: 17: 




EP 0 798 378 B1 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:



TCTTGTTCTG GACAGGGATG 




(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

GCATGGAACA TCTGCTCAAC



(2) INFORMATION FOR SEQ ID NO: 19: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 



AGCAAGTTCA GCCTGTTAAG T 21 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 



(A) LENGTH: 1257 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

ATGAATTACA GCATTCCCAG CAATGTCACT AACTTGGAAG GTGGGCCTGG TCGGCAGACC 60
ACAAGCCCAA ATGTGTTGTG GCCAACACCT GGGCACCTTT CTCCTTTAGT GGTCCATCGC 120
CAGTTATCAC ATCTGTATGC GGAACCTCAA AAGAGTCCCT GGTGTGAAGC AAGATCGCTA 180
GAACACACCT TACCTGTAAA CAGAGAGACA CTGAAAAGGA AGGTTAGTGG GAACCGTTGC 240
GCCAGCCCTG TTACTGGTCC AGGTTCAAAG AGGGATGCTC ACTTCTGCGC TGTCTGCAGC 300
GATTACGCAT CGGGATATCA CTATGGAGTC TGGTCGTGTG AAGGATGTAA GGCCTTTTTT 360
AAAAGAAGCA TTCAAGGACA TAATGATTAT ATTTGTCCAG CTACAAATCA GTGTACAATC 420
GATAAAAACC GGCGCAAGAG CTGCCAGGCC TGCCGACTTC GGAAGTGTTA CGAAGTGGGA 480
ATGGTGAAGT GTGGCTCCCG GAGAGAGAGA TGTGGGTACC GCCTTGTGCG GAGACAGAGA 540
AGTGCCGACG AGCAGCTGCA CTGTGCCGGC AAGGCCAAGA GAAGTGGCGG CCACGCGCCC 600
CGAGTGCGGG AGCTGCTGCT GGACGCCCTG AGCCCCGAGC AGCTAGTGCT CACCCTCCTG 660
GAGGCTGAGC CGCCCCATGT GCTGATCAGC CGCCCCAGTG CGCCCTTCAC CGAGGCCTCC 720
ATGATGATGT CCCTGACCAA GTTGGCCGAC AAGGAGTTGG TACACATGAT CAGCTGGGCC 780
AAGAAGATTC CCGGCTTTGT GGAGCTCAGC CTGTTCGACC AAGTGCGGCT CTTGGAGAGC 840
TGTTGGATGG AGGTGTTAAT GATGGGGCTG ATGTGGCGCT CAATTGACCA CCCCGGCAAG 900
CTCATCTTTG CTCCAGATCT TGTTCTGGAC AGGGATGAGG GGAAATGCGT AGAAGGAATT 960
CTGGAAATCT TTGACATGCT CCTGGCAACT ACTTCAAGGT TTCGAGAGTT AAAACTCCAA 1020
CACAAAGAAT ATCTCTGTGT CAAGGCCATG ATCCTGCTCA ATTCCAGTAT GTACCCTCTG 1080
GTCACAGCGA CCCAGGATGC TGACAGCAGC CGGAAGCTGG CTCACTTGCT GAACGCCGTG 1140
ACCGATGCTT TGGTTTGGGT GATTGCCAAG AGCGGCATCT CCTCCCAGCA GCAATCCATG 1200
CGCCTGGCTA ACCTCCTGAT GCTCCTGTCC CACGTCAGGC ATGCGAGGTC TGCCTGA 1257


(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 418 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:



Met Asn Tyr Ser Ile Pro Ser Asn Val Thr Asn Leu Glu Gly Gly Pro
1 5 10 15

Gly Arg Gln Thr Thr Ser Pro Asn Val Leu Trp Pro Thr Pro Gly His
20 25 30

Leu Ser Pro Leu Val Val His Arg Gln Leu Ser His Leu Tyr Ala Glu
35 40 45

Pro Gln Lys Ser Pro Trp Cys Glu Ala Arg Ser Leu Glu His Thr Leu
50 55 60

Pro Val Asn Arg Glu Thr Leu Lys Arg Lys Val Ser Gly Asn Arg Cys
65 70 75 80

Ala Ser Pro Val Thr Gly Pro Gly Ser Lys Arg Asp Ala His Phe Cys
85 90 95

Ala Val Cys Ser Asp Tyr Ala Ser Gly Tyr His Tyr Gly Val Trp Ser
100 105 110

Cys Glu Gly Cys Lys Ala Phe Phe Lys Arg Ser Ile Gln Gly His Asn
115 120 125

Asp Tyr Ile Cys Pro Ala Thr Asn Gln Cys Thr Ile Asp Lys Asn Arg
130 135 140

Arg Lys Ser Cys Gln Ala Cys Arg Leu Arg Lys Cys Tyr Glu Val Gly
145 150 155 160

Met Val Lys Cys Gly Ser Arg Arg Glu Arg Cys Gly Tyr Arg Leu Val
165 170 175

Arg Arg Gln Arg Ser Ala Asp Glu Gln Leu His Cys Ala Gly Lys Ala
180 185 190

Lys Arg Ser Gly Gly His Ala Pro Arg Val Arg Glu Leu Leu Leu Asp
195 200 205

Ala Leu Ser Pro Glu Gln Leu Val Leu Thr Leu Leu Glu Ala Glu Pro
210 215 220

Pro His Val Leu Ile Ser Arg Pro Ser Ala Pro Phe Thr Glu Ala Ser
225 230 235 240

Met Met Met Ser Leu Thr Lys Leu Ala Asp Lys Glu Leu Val His Met
245 250 255

Ile Ser Trp Ala Lys Lys Ile Pro Gly Phe Val Glu Leu Ser Leu Phe
260 265 270

Asp Gln Val Arg Leu Leu Glu Ser Cys Trp Met Glu Val Leu Met Met
275 280 285

Gly Leu Met Trp Arg Ser Ile Asp His Pro Gly Lys Leu Ile Phe Ala
290 295 300

Pro Asp Leu Val Leu Asp Arg Asp Glu Gly Lys Cys Val Glu Gly Ile
305 310 315 320

Leu Glu Ile Phe Asp Met Leu Leu Ala Thr Thr Ser Arg Phe Arg Glu
325 330 335

Leu Lys Leu Gln His Lys Glu Tyr Leu Cys Val Lys Ala Met Ile Leu
340 345 350

Leu Asn Ser Ser Met Tyr Pro Leu Val Thr Ala Thr Gln Asp Ala Asp
355 360 365

Ser Ser Arg Lys Leu Ala His Leu Leu Asn Ala Val Thr Asp Ala Leu
370 375 380

Val Trp Val Ile Ala Lys Ser Gly Ile Ser Ser Gln Gln Gln Ser Met
385 390 395 400

Arg Leu Ala Asn Leu Leu Met Leu Leu Ser His Val Arg His Ala Arg
405 410 415

Ser Ala
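The 1257 base pairs of SEQ ID NO: 20 correspond to 418 codons plus one stop codon (418 x 3 + 3 = 1257), which agrees with the 418 amino acids of SEQ ID NO: 21. The sketch below, which assumes the standard genetic code and reading frame 1, translates the first 48 and the last 57 bases of SEQ ID NO: 20 so that the result can be compared with the first and last rows of SEQ ID NO: 21. The helper names `CODON_TABLE` and `translate` are illustrative only.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate frame 1 of a DNA string, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Bases 1-48 and 1201-1257 of SEQ ID NO: 20 as printed in the listing.
FIRST_48 = "ATGAATTACA GCATTCCCAG CAATGTCACT AACTTGGAAG GTGGGCCT"
LAST_57 = "CGCCTGGCTA ACCTCCTGAT GCTCCTGTCC CACGTCAGGC ATGCGAGGTC TGCCTGA"

print(translate(FIRST_48))
print(translate(LAST_57))
```

In one-letter code the two fragments read MNYSIPSNVTNLEGGP and RLANLLMLLSHVRHARSA, which match residues 1-16 and 401-418 of SEQ ID NO: 21.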



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 


CTTGGATCCA TAGCCCTGCT GTGATGAATT ACAG 34 



(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 



GATGGATCCT CACCTCAGGG CCAGGCGTCA CTG 33 
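SEQ ID NO: 22 and SEQ ID NO: 23 both contain the hexamer GGATCC, which is the recognition sequence of the restriction enzyme BamHI; whether that site was actually used for cloning is not stated in this part of the listing. The 3' end of SEQ ID NO: 22 (ATGAATTACAG) equals the first bases of SEQ ID NO: 20, and the reverse complement of the 3' part of SEQ ID NO: 23 occurs in SEQ ID NO: 24. A minimal sketch for checking such relationships follows; the function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
BAMHI_SITE = "GGATCC"


def clean(seq):
    """Remove the blanks that separate the 10-base groups."""
    return "".join(seq.split()).upper()


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return clean(seq).translate(COMPLEMENT)[::-1]


SEQ_ID_22 = "CTTGGATCCA TAGCCCTGCT GTGATGAATT ACAG"
SEQ_ID_23 = "GATGGATCCT CACCTCAGGG CCAGGCGTCA CTG"

for name, seq in (("SEQ ID NO: 22", SEQ_ID_22), ("SEQ ID NO: 23", SEQ_ID_23)):
    s = clean(seq)
    # str.find returns the 0-based offset of the site, or -1 if absent.
    print(name, "length", len(s), "GGATCC at offset", s.find(BAMHI_SITE))

print("reverse complement of SEQ ID NO: 23:", reverse_complement(SEQ_ID_23))
```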

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1898 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 



CACGAATCTT TGAGAACATT ATAATGACCT TTGTGCCTCT TCTTGCAAGG TGTTTTCTCA 60
GCTGTTATCT CAAGACATGG ATATAAAAAA CTCACCATCT AGCCTTAATT CTCCTTCCTC 120
CTACAACTGC AGTCAATCCA TCTTACCCCT GGAGCACGGC TCCATATACA TACCTTCCTC 180
CTATGTAGAC AGCCACCATG AATATCCAGC CATGACATTC TATAGCCCTG CTGTGATGAA 240
TTACAGCATT CCCAGCAATG TCACTAACTT GGAAGGTGGG CCTGGTCGGC AGACCACAAG 300
CCCAAATGTG TTGTGGCCAA CACCTGGGCA CCTTTCTCCT TTAGTGGTCC ATCGCCAGTT 360
ATCACATCTG TATGCGGAAC CTCAAAAGAG TCCCTGGTGT GAAGCAAGAT CGCTAGAACA 420
CACCTTACCT GTAAACAGAG AGACACTGAA AAGGAAGGTT AGTGGGAACC GTTGCGCCAG 480
CCCTGTTACT GGTCCAGGTT CAAAGAGGGA TGCTCACTTC TGCGCTGTCT GCAGCGATTA 540
CGCATCGGGA TATCACTATG GAGTCTGGTC GTGTGAAGGA TGTAAGGCCT TTTTTAAAAG 600
AAGCATTCAA GGACATAATG ATTATATTTG TCCAGCTACA AATCAGTGTA CAATCGATAA 660
AAACCGGCGC AAGAGCTGCC AGGCCTGCCG ACTTCGGAAG TGTTACGAAG TGGGAATGGT 720
GAAGTGTGGC TCCCGGAGAG AGAGATGTGG GTACCGCCTT GTGCGGAGAC AGAGAAGTGC 780
CGACGAGCAG CTGCACTGTG CCGGCAAGGC CAAGAGAAGT GGCGGCCACG CGCCCCGAGT 840
GCGGGAGCTG CTGCTGGACG CCCTGAGCCC CGAGCAGCTA GTGCTCACCC TCCTGGAGGC 900
TGAGCCGCCC CATGTGCTGA TCAGCCGCCC CAGTGCGCCC TTCACCGAGG CCTCCATGAT 960
GATGTCCCTG ACCAAGTTGG CCGACAAGGA GTTGGTACAC ATGATCAGCT GGGCCAAGAA 1020
GATTCCCGGC TTTGTGGAGC TCAGCCTGTT CGACCAAGTG CGGCTCTTGG AGAGCTGTTG 1080
GATGGAGGTG TTAATGATGG GGCTGATGTG GCGCTCAATT GACCACCCCG GCAAGCTCAT 1140
CTTTGCTCCA GATCTTGTTC TGGACAGGGA TGAGGGGAAA TGCGTAGAAG GAATTCTGGA 1200
AATCTTTGAC ATGCTCCTGG CAACTACTTC AAGGTTTCGA GAGTTAAAAC TCCAACACAA 1260
AGAATATCTC TGTGTCAAGG CCATGATCCT GCTCAATTCC AGTATGTACC CTCTGGTCAC 1320
AGCGACCCAG GATGCTGACA GCAGCCGGAA GCTGGCTCAC TTGCTGAACG CCGTGACCGA 1380
TGCTTTGGTT TGGGTGATTG CCAAGAGCGG CATCTCCTCC CAGCAGCAAT CCATGCGCCT 1440
GGCTAACCTC CTGATGCTCC TGTCCCACGT CAGGCATGCG AGTAACAAGG GCATGGAACA 1500
TCTGCTCAAC ATGAAGTGCA AAAATGTGGT CCCAGTGTAT GACCTGCTGC TGGAGATGCT 1560
GAATGCCCAC GTGCTTCGCG GGTGCAAGTC CTCCATCACG GGGTCCGAGT GCAGCCCGGC 1620
AGAGGACAGT AAAAGCAAAG AGGGCTCCCA GAACCCACAG TCTCAGTGAC GCCTGGCCCT 1680
GAGGTGAACT GGCCCACAGA GGTCACAAGC TGAAGCGTGA ACTCCAGTGT GTCAGGAGCC 1740
TGGGCTTCAT CTTTCTGCTG TGTGGTCCCT CATTTGGTGA TGGCAGGCTT GGTCATGTAC 1800
CATCCTTCCC TCCACCTTCC CAACTCTCAG GAGTCGGTGT GAGGAAGCCA TAGTTTCCCT 1860
TGTTAGCAGA GGGACATTTG AATCGAGCGT TTCCACAC 1898

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 530 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

Met Asp Ile Lys Asn Ser Pro Ser Ser Leu Asn Ser Pro Ser Ser Tyr
1 5 10 15

Asn Cys Ser Gln Ser Ile Leu Pro Leu Glu His Gly Ser Ile Tyr Ile
20 25 30

Pro Ser Ser Tyr Val Asp Ser His His Glu Tyr Pro Ala Met Thr Phe
35 40 45

Tyr Ser Pro Ala Val Met Asn Tyr Ser Ile Pro Ser Asn Val Thr Asn
50 55 60

Leu Glu Gly Gly Pro Gly Arg Gln Thr Thr Ser Pro Asn Val Leu Trp
65 70 75 80

Pro Thr Pro Gly His Leu Ser Pro Leu Val Val His Arg Gln Leu Ser
85 90 95

His Leu Tyr Ala Glu Pro Gln Lys Ser Pro Trp Cys Glu Ala Arg Ser
100 105 110

Leu Glu His Thr Leu Pro Val Asn Arg Glu Thr Leu Lys Arg Lys Val
115 120 125

Ser Gly Asn Arg Cys Ala Ser Pro Val Thr Gly Pro Gly Ser Lys Arg
130 135 140

Asp Ala His Phe Cys Ala Val Cys Ser Asp Tyr Ala Ser Gly Tyr His
145 150 155 160

Tyr Gly Val Trp Ser Cys Glu Gly Cys Lys Ala Phe Phe Lys Arg Ser
165 170 175

Ile Gln Gly His Asn Asp Tyr Ile Cys Pro Ala Thr Asn Gln Cys Thr
180 185 190

Ile Asp Lys Asn Arg Arg Lys Ser Cys Gln Ala Cys Arg Leu Arg Lys
195 200 205

Cys Tyr Glu Val Gly Met Val Lys Cys Gly Ser Arg Arg Glu Arg Cys
210 215 220

Gly Tyr Arg Leu Val Arg Arg Gln Arg Ser Ala Asp Glu Gln Leu His
225 230 235 240

Cys Ala Gly Lys Ala Lys Arg Ser Gly Gly His Ala Pro Arg Val Arg
245 250 255

Glu Leu Leu Leu Asp Ala Leu Ser Pro Glu Gln Leu Val Leu Thr Leu
260 265 270

Leu Glu Ala Glu Pro Pro His Val Leu Ile Ser Arg Pro Ser Ala Pro
275 280 285

Phe Thr Glu Ala Ser Met Met Met Ser Leu Thr Lys Leu Ala Asp Lys
290 295 300

Glu Leu Val His Met Ile Ser Trp Ala Lys Lys Ile Pro Gly Phe Val
305 310 315 320

Glu Leu Ser Leu Phe Asp Gln Val Arg Leu Leu Glu Ser Cys Trp Met
325 330 335

Glu Val Leu Met Met Gly Leu Met Trp Arg Ser Ile Asp His Pro Gly
340 345 350

Lys Leu Ile Phe Ala Pro Asp Leu Val Leu Asp Arg Asp Glu Gly Lys
355 360 365

Cys Val Glu Gly Ile Leu Glu Ile Phe Asp Met Leu Leu Ala Thr Thr
370 375 380

Ser Arg Phe Arg Glu Leu Lys Leu Gln His Lys Glu Tyr Leu Cys Val
385 390 395 400

Lys Ala Met Ile Leu Leu Asn Ser Ser Met Tyr Pro Leu Val Thr Ala
405 410 415

Thr Gln Asp Ala Asp Ser Ser Arg Lys Leu Ala His Leu Leu Asn Ala
420 425 430

Val Thr Asp Ala Leu Val Trp Val Ile Ala Lys Ser Gly Ile Ser Ser
435 440 445

Gln Gln Gln Ser Met Arg Leu Ala Asn Leu Leu Met Leu Leu Ser His
450 455 460

Val Arg His Ala Ser Asn Lys Gly Met Glu His Leu Leu Asn Met Lys
465 470 475 480

Cys Lys Asn Val Val Pro Val Tyr Asp Leu Leu Leu Glu Met Leu Asn
485 490 495

Ala His Val Leu Arg Gly Cys Lys Ser Ser Ile Thr Gly Ser Glu Cys
500 505 510

Ser Pro Ala Glu Asp Ser Lys Ser Lys Glu Gly Ser Gln Asn Pro Gln
515 520 525

Ser Gln
530
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(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 



GTGCGGATCC TCTCAAGACA TGGATATAAA 30 



(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 



AGTAACAGGG CTGGCGCAAC GGTTC 

(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 


(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

ACTGGCGATG GACCACTAAA GG 



Claims

1. Isolated estrogen receptor having an N-terminal domain, a DNA-binding domain, and a ligand-binding domain, wherein the amino acid sequence of said DNA-binding domain exhibits at least 80% homology with the amino acid sequence shown in SEQ ID NO: 3 and the amino acid sequence of said ligand-binding domain of said estrogen receptor exhibits at least 70% homology with the amino acid sequence shown in SEQ ID NO: 4,
provided that the estrogen receptor does not have the amino acid sequence:

MTFYS PAVMN YSVPG STSNL DGGPV RLSTS PNVLW PTSGH LSPLA
THCQS SLLYA EPQKS PWCEA RSLEH TLPVN RETLK RKLSG SSCAS
PVTSP NAKRD AHFCP VCSDY ASGYH YGVWS CEGCK AFFKR SIQGH
NDYIC PATNQ CTIDK NRRKS CQACR LRKCY EVGMV KCGSR RERCG
YRIVR RQRSS SEQVH CLSKA KRNGG HAPRV KELLL STLSP EQLVL
TLLEA EPPNV LVSRP SMPFT EASMM MSLTK LADKE LVHMI GWAKK
IPGFV ELSLL DQVRL LESCW MEVLM VGLMW RSIDH PGKLI FAPDL
VLDRD EGKCV EGILE IFDML LATTS RFREL KLQHK EYLCV KAMIL
LNSSM YPLAS ANQEA ESSRK LTHLL NAVTD ALVWV IAKSG ISSQQ
QSVRL ANLLM LLSHV RHISN KGMEH LLSMK CKNVV PVYDL LLEML
NAHTL RGYKS SISGS ECSST EDSKN KESSQ NLQSQ

2. Isolated estrogen receptor according to claim 1, characterised in that the amino acid sequence of said DNA-binding domain exhibits at least 90% homology with the amino acid sequence shown in SEQ ID NO: 3.

3. Isolated estrogen receptor according to any one of claims 1-2, characterised in that the amino acid sequence of said ligand-binding domain exhibits at least 75% homology with the amino acid sequence shown in SEQ ID NO: 4.

4. Isolated estrogen receptor according to any one of claims 1-3, characterised in that said estrogen receptor comprises the amino acid sequence of SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 21 or SEQ ID NO: 25.

5. Isolated DNA encoding an estrogen receptor according to claims 1-4.

6. Isolated DNA according to claim 5, characterised in that said DNA comprises the nucleic acid sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 20 or SEQ ID NO: 24.

7. A recombinant expression vector comprising the DNA according to claim 5 or 6.

8. A cell transfected with DNA according to claim 5 or 6 or an expression vector according to claim 7.

9. A cell according to claim 8 which is a stably transfected cell line which expresses the estrogen receptor according to any of the claims 1-4.

10. Use of a DNA according to claim 5 or 6, an expression vector according to claim 7, a cell according to claim 8 or 9, a receptor according to any one of claims 1-4, in a screening assay for identification of new drugs.

11. Method of identifying functional ligands for a receptor according to any one of claims 1-4, said method comprising the steps of

a) introducing into a suitable host cell 1) DNA according to claims 5 or 6, and 2) a suitable reporter gene functionally linked to an operative hormone responsive element (HRE), said HRE being able to be activated by the DNA-binding domain of the receptor encoded by said DNA;

b) bringing said host cell from a) into contact with potential ligands which will possibly bind to the ligand-binding domain of the receptor encoded by said DNA from step a); and

c) monitoring the expression of the receptor encoded by said reporter gene of step a).
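Claims 1 to 3 set percentage thresholds (80% or 90% for the DNA-binding domain, 70% or 75% for the ligand-binding domain). How homology is to be calculated is not defined in this section, so the sketch below makes a simplifying assumption: it treats homology as percent identity over two fragments that are already aligned and of equal length. The two sample fragments are short stretches taken from SEQ ID NO: 25 (residues 145-162) and from the sequence excluded by claim 1; they serve only as sample input and are not SEQ ID NO: 3 or SEQ ID NO: 4. All function names are illustrative.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned positions that carry the same residue.

    Assumes both strings are already aligned and equally long;
    a gap character "-" never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a, aligned_b, minimum_percent):
    """True if the identity reaches the claimed minimum percentage."""
    return percent_identity(aligned_a, aligned_b) >= minimum_percent


# Sample input only (see the lead-in for where the fragments come from).
FRAGMENT_SEQ_25 = "DAHFCAVCSDYASGYHYG"
FRAGMENT_EXCLUDED = "DAHFCPVCSDYASGYHYG"

identity = percent_identity(FRAGMENT_SEQ_25, FRAGMENT_EXCLUDED)
print(round(identity, 1), meets_threshold(FRAGMENT_SEQ_25, FRAGMENT_EXCLUDED, 80))
```

The two fragments differ at one of 18 positions, so the identity is 17/18. A real comparison would first align the full domains, and the outcome can depend on the alignment method and on how gaps are scored.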
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Patentanspriiche 

1. Isolated estrogen receptor having an N-terminal domain, a DNA-binding domain and a ligand-binding domain, wherein the amino acid sequence of said DNA-binding domain has at least 80% homology with the amino acid sequence shown in SEQ ID NO: 3 and the amino acid sequence of said ligand-binding domain of said estrogen receptor has at least 70% homology with the amino acid sequence shown in SEQ ID NO: 4, provided that the estrogen receptor does not have the following amino acid sequence:

MTFYS PAVMN YSVPG STSNL DGGPV RLSTS PNVLW PTSGH LSPLA
THCQS SLLYA EPQKS PWCEA RSLEH TLPVN RETLK RKLSG SSCAS
PVTSP NAKRD AHFCP VCSDY ASGYH YGVWS CEGCK AFFKR SIQGH
NDYIC PATNQ CTIDK NRRKS CQACR LRKCY EVGMV KCGSR RERCG
YRIVR RQRSS SEQVH CLSKA KRNGG HAPRV KELLL STLSP EQLVL
TLLEA EPPNV LVSRP SMPFT EASMM MSLTK LADKE LVHMI GWAKK
IPGFV ELSLL DQVRL LESCW MEVLM VGLMW RSIDH PGKLI FAPDL
VLDRD EGKCV EGILE IFDML LATTS RFREL KLQHK EYLCV KAMIL
LNSSM YPLAS ANQEA ESSRK LTHLL NAVTD ALVWV IAKSG ISSQQ
QSVRL ANLLM LLSHV RHISN KGMEH LLSMK CKNVV PVYDL LLEML
NAHTL RGYKS SISGS ECSST EDSKN KESSQ NLQSQ

2. Isolated estrogen receptor according to claim 1, characterised in that the amino acid sequence of said DNA-binding domain has at least 90% homology with the amino acid sequence shown in SEQ ID NO: 3.

3. Isolated estrogen receptor according to any one of claims 1-2, characterised in that the amino acid sequence of said ligand-binding domain has at least 75% homology with the amino acid sequence shown in SEQ ID NO: 4.

4. Isolated estrogen receptor according to any one of claims 1-3, characterised in that said estrogen receptor comprises the amino acid sequence of SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 21 or SEQ ID NO: 25.

5. Isolated DNA which codes for an estrogen receptor according to claims 1-4.

6. Isolated DNA according to claim 5, characterised in that said DNA comprises the nucleic acid sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 20 or SEQ ID NO: 24.

7. Recombinant expression vector which comprises the DNA according to either of claims 5 or 6.

8. Cell which is transfected with the DNA according to either of claims 5 or 6 or with an expression vector according to claim 7.

9. Cell according to claim 8, which is a stably transfected cell line that expresses the estrogen receptor according to any one of claims 1-4.

10. Use of a DNA according to either of claims 5 or 6, an expression vector according to claim 7, a cell according to claim 8 or 9, or a receptor according to any one of claims 1-4 in a screening assay for the identification of new drugs.

11. Method for identifying functional ligands for a receptor according to any one of claims 1-4, wherein said method comprises the following steps:

a) introducing into a suitable host cell 1) DNA according to claims 5 or 6 and 2) a suitable reporter gene which is functionally linked to an operative hormone responsive element (HRE), said HRE being capable of being activated by the DNA-binding domain of the receptor encoded by said DNA;

b) bringing said host cell from a) into contact with potential ligands which may bind to the ligand-binding domain of the receptor encoded by said DNA of step a); and

c) monitoring the expression of the receptor which is encoded by said reporter gene of step a).



Claims (translation of the French text)

1. Isolated estrogen receptor having an N-terminal domain, a DNA-binding domain, and a ligand-binding domain, in which the amino acid sequence of said DNA-binding domain shows at least 80% homology with the amino acid sequence indicated in SEQ ID NO: 3 and the amino acid sequence of said ligand-binding domain of said estrogen receptor shows at least 70% homology with the amino acid sequence indicated in SEQ ID NO: 4,
on condition that the estrogen receptor does not have the amino acid sequence:

MTFYS PAVMN YSVPG STSNL DGGPV RLSTS PNVLW PTSGH LSPLA
THCQS SLLYA EPQKS PWCEA RSLEH TLPVN RETLK RKLSG SSCAS
PVTSP NAKRD AHFCP VCSDY ASGYH YGVWS CEGCK AFFKR SIQGH
NDYIC PATNQ CTIDK NRRKS CQACR LRKCY EVGMV KCGSR RERCG
YRIVR RQRSS SEQVH CLSKA KRNGG HAPRV KELLL STLSP EQLVL
TLLEA EPPNV LVSRP SMPFT EASMM MSLTK LADKE LVHMI GWAKK
IPGFV ELSLL DQVRL LESCW MEVLM VGLMW RSIDH PGKLI FAPDL
VLDRD EGKCV EGILE IFDML LATTS RFREL KLQHK EYLCV KAMIL
LNSSM YPLAS ANQEA ESSRK LTHLL NAVTD ALVWV IAKSG ISSQQ
QSVRL ANLLM LLSHV RHISN KGMEH LLSMK CKNVV PVYDL LLEML
NAHTL RGYKS SISGS ECSST EDSKN KESSQ NLQSQ

2. Isolated estrogen receptor according to claim 1, characterised in that the amino acid sequence of said DNA-binding domain shows at least 90% homology with the amino acid sequence indicated in SEQ ID NO: 3.

3. Isolated estrogen receptor according to either of claims 1 and 2, characterised in that the amino acid sequence of said ligand-binding domain shows at least 75% homology with the amino acid sequence indicated in SEQ ID NO: 4.

4. Isolated estrogen receptor according to any one of claims 1 to 3, characterised in that said estrogen receptor comprises an amino acid sequence of SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 21 or SEQ ID NO: 25.

5. Isolated DNA coding for an estrogen receptor according to any one of claims 1 to 4.

6. Isolated DNA according to claim 5, characterised in that said DNA comprises the nucleic acid sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 20 or SEQ ID NO: 24.

7. Recombinant expression vector comprising the DNA according to either of claims 5 or 6.

8. Cell transfected with the DNA according to either of claims 5 or 6 or with an expression vector according to claim 7.

9. Cell according to claim 8 which is a stably transfected cell line that expresses the estrogen receptor according to any one of claims 1 to 4.

10. Use of a DNA according to either of claims 5 or 6, of an expression vector according to claim 7, of a cell according to claim 8 or 9, or of a receptor according to any one of claims 1 to 4, in a screening test for identifying new drugs.

11. Method for identifying functional ligands for a receptor according to any one of claims 1 to 4, said method comprising the steps of

(a) introducing into a suitable host cell:

(i) a DNA according to either of claims 5 or 6, and

(ii) a suitable reporter gene functionally linked to a hormone response element (HRE), said HRE being capable of being activated by the DNA-binding domain of the receptor encoded by said DNA;

(b) bringing said host cell of (a) into contact with potential ligands which are likely to bind to the ligand-binding domain of the receptor encoded by said DNA of step (a); and

(c) monitoring the expression of the receptor encoded by said reporter gene of step (a).
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Transient transfection of CHO celts with Estrogen Receptors Alpha and Beta 
incubation with Estradiol (E2) and ICI

Conditions shown: no hormone; E2 10⁻⁹; E2 10⁻⁹ + ICI 10⁻⁶

Figure 3
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ERg and ERft RT PCR on tissue-representative cell lines 




1. Ishikawa 

2. HEC-1A 

3. RL95-2 

4. ECC-1 

5. SaOS-2 



6. HOS 

7. U2-OS

8. MG-63 

9. MCF-7 

10. T47-D 



11. HS-760T 

12. SW-954 

13. Hep-G2

14. CaCo 

15. HISM 



16. HUV-EC-C 

17. BAEC-1 

18. A10 

19. A7R5 

20. CavaSMC 



21. RASMC 
bl. blank 



Figure 4 
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